Edited by


Donald G. Paterson 


University of Minnesota 




Consulting Editors 


Paul S. Achilles, Psychological Corporation; Walter V. Bingham, Washington, D. C.; Harold E.
Burtt, Ohio State University; Arthur I. Gates, T. C. Columbia University; Irving Lorge, T. C.
Columbia University; Quinn McNemar, Stanford University; Willard C. Olson, University of
Michigan; James P. Porter, Swarthmore, Pennsylvania; Edward K. Strong, Jr., Stanford
University; Morris S. Viteles, University of Pennsylvania; Joseph Zubin, N. Y. Psychiatric Institute.




Volume 32, 1948 




Published Bi-monthly by The American Psychological Association, Inc.
Prince and Lemon Sts., Lancaster, Pa., and 1515 Massachusetts Ave., N.W., Washington 5, D. C.

Entered as second-class matter, August 19, 1943, at the post office at Lancaster, Pa., under the act of March 3, 1879.
Acceptance for mailing at the special rate of postage provided for in the Act of February 28, 1925, P. L. & R., authorized October 10, 1947.


Copyright, 1948, by The American Psychological Association, Inc. 




Contents of Volume 32 


Articles 


Altus, W. D. A College Achiever and Non-Achiever Scale for the
    Minnesota Multiphasic Personality Inventory .... 385
Ash, P. The “Liberalism” of Congressmen Voting For and Against
    the Taft-Hartley Act .... 636
Ash, P. The Reliability of Job Evaluation Rankings .... 313
Barkley, K. L. Influence of College Science Courses on the
    Development of Attitude Toward Evolution .... 200
Bass, B. M. Application of Addends to Sales and Clerical
    Occupational Classification .... 490
Beamer, G. C., Edmonson, L. D. and Strother, G. B. Improving
    the Selection of Linotype Trainees .... 130
Bellows, R. M. and Estep, M. F. Job Evaluation Simplified: The
    Utility of the Occupational Characteristics Check List .... 354
Bowles, J. W., Jr. and Pronko, N. H. Identification of Cola
    Beverages: II. A Further Study .... 559
Brimhall, D. R. and Otis, A. S. Consistency of Voting by Our
    Congressmen .... 1
Brown, C. W. and Ghiselli, E. E. Accident Proneness Among
    Street Car Motormen and Motor Coach Operators .... 20
Brown, H. S. Similarities and Differences in College Populations on
    the Multiphasic .... 541
Chesler, D. J. Reliability of Abbreviated Job Evaluation Scales .... 622
Chesler, D. J. Reliability and Comparability of Different Job
    Evaluation Systems .... 465
Clark, K. E. Opinions of Residents Toward an Industrial Nuisance .... 435
Coffin, T. E. Television's Effects on Leisure-Time Activities .... 550
Corsini, R. J. The Pin Prick Method of Secret Balloting .... 641
Diamond, S. The Interpretation of Interest Profiles .... 512
Dietsch, R. W. and Gurnee, H. Cumulative Effect of a Series of
    Campaign Leaflets .... 189
Dreffin, W. B. and Wrenn, C. G. Spatial Relations Ability and
    Other Characteristics of Art Laboratory Students .... 601
Eckerman, A. C. An Analysis of Grievances and Aggrieved
    Employees in a Machine Shop and Foundry .... 255
Eckstrand, G. and Gilliland, A. R. The Psychogalvanometric
    Method for Measuring the Effectiveness of Advertising .... 415




Edwards, A. L. and Kilpatrick, F. P. A Technique for the
    Construction of Attitude Scales .... 374
Edwards, A. S. The Effect of Smoking on Tremor .... 150
Edwards, R. S. Words Are Dynamite .... 370
Flesch, R. A New Readability Yardstick .... 221
Gage, N. L. and Remmers, H. H. Opinion Polling with
    Mark-Sensed Punch Cards .... 88
Ghiselli, E. E. and Brown, C. W. The Effectiveness of Intelligence
    Tests in the Selection of Workers .... 575
Giese, W. J. How Better Personnel Selection Can Reduce Factory
    [...] .... 344
Giese, W. J. and Weigle, F. Evaluation of a Clerical Applicant
    Testing Program .... 581
Grether, W. F. Factors in the Design of Clock Dials Which Affect
    Speed and Accuracy of Reading in the 2400-Hour Time System .... 159
Grigg, A. E. A Farm Knowledge Test .... 452
Guilford, J. P. and Zimmerman, W. S. The Guilford-Zimmerman
    Aptitude Survey .... 24
Hay, E. N. Creating Factor Comparison Key Scales by the Per
    Cent Method .... 456
Henrikson, E. H. A Study of Stage Fright and the Judgment of
    Speaking Time .... 532
Herrmann, M. and Hackman, R. B. Distributions of Scores on the
    Wechsler-Bellevue Scales and the California Test of Mental
    Maturity at a V. A. Guidance Center .... 642
Jarrett, R. F. Per Cent Increase in Output of Selected Personnel
    as an Index of Test Efficiency .... 135
John, E. R. Inter-Relationships of Selected Personnel Functions .... 146
Jones, A. M. Job Evaluation of Nonacademic Work at the
    University of Illinois .... 15
Jurgensen, C. E. Norms for the Test of Mechanical Comprehension .... 618
Kephart, N. C. Visual Skills and Labor Turnover .... 51
Kerr, W. A. On the Validity and Reliability of the Job Satisfaction
    Tear Ballot .... 275
Knauft, E. B. Construction and Use of Weighted Check-List
    Rating Scales for Two Industrial Situations .... 63
Kryter, K. D. Effects of High Altitude on Speech Intelligibility .... 503
Lauro, L. A Note on Machine Scoring the Kuder Preference
    Record .... 629
Lawshe, C. H., Jr., Dudek, E. E., and Wilson, R. F. Studies of Job
    Evaluation. 7. A Factor Analysis of Two Point Rating Methods
    of Job Evaluation .... 118


Laybourn, G. P. and Longstaff, H. P. College Students’ Opinions
    of Radio Advertising .... 81
Link, H. C. The Ninety-Fourth Issue of the Psychological
    Barometer and a Note on Its Fifteenth Anniversary .... 105
Link, H. C. and Freiberg, A. D. The 97th Psychological Barometer .... 443
Longstaff, H. P. Fakability of the Strong Interest Blank and the
    Kuder Preference Record .... 360
Lovell, G. D., Davis, H. and Meacham, A. A Validating Study of
    the Work Preference Inventory .... 195
MacMillan, M. H. and Rothe, H. F. Additional Distributions of
    Test Scores of Industrial Employees and Applicants .... 270
McElheny, W. T. A Study of Two Techniques of Measuring
    “Mechanical Comprehension” .... 61
Miles, R. W. A Proposed Short Form of the Kuder Preference
    Record .... 282
Nahm, H. Satisfaction With Nursing .... 335
Newman, S. H. and Bobbitt, J. M. The Development of Entrance
    Tests for the United States Coast Guard Academy .... 248
Paterson, D. G. and Jenkins, J. J. Communication Between
    Management and Workers .... 71
Perloff, E. Prediction of Male Readership of Magazine Articles .... 668
Pronko, N. H. and Bowles, J. W., Jr. Identification of Cola
    Beverages, I. First Study .... 304
Recktenwald, L. N. Certain Factors Bearing on the Cleeton
    Vocational Interest Inventory .... 527
Richardson, H. M. Adult Leadership Scales Based on the
    Bernreuter Personality Inventory .... 292
Roscoe, S. N. The Effects of Eliminating Binocular and Peripheral
    Monocular Visual Cues Upon Airplane Pilot Performance in
    Landing .... 649
Sleight, R. B. The Effect of Instrument Dial Shape on Legibility .... 170
Sleight, R. B. and Tiffin, J. Industrial Noise and Hearing .... 476
Smith, M. Cautions Concerning the Use of the Taylor-Russell
    Tables in Employee Selection .... 595
Speer, G. S. The Kuder Interest Test Patterns of Fire Protection
    Engineers .... 521
Stogdill, R. M. and Shartle, C. L. Methods for Determining
    Patterns of Leadership Behavior in Relation to Organization
    Structure and Objectives .... 286
Stone, C. H. and Simos, I. A Follow-up Study of Personal
    Counseling Versus Counseling by Letter .... 408




Strange, J. R. and Sartain, A. Q. Veterans’ Scores on the Purdue
    Pegboard Test
Sulzman, J. H., Cook, E. B. and Bartlett, N. R. The Validity and
    Reliability of Heterophoria Scores Yielded by Three Commercial
    Optical Devices
Taft, R. Use of the “Group Situation Observation” Method in the
    Selection of Trainee Executives
Thompson, G. M. College Grades and the Group Rorschach
Tiffin, J. and Asher, E. J. The Purdue Pegboard: Norms and
    Studies of Reliability and Validity
Tinker, M. A. Cumulative Effect of Marginal Conditions Upon
    Rate of Perception in Reading
Weinland, J. D. The Use of Rating Scales and Personal Inventories
    to Check Each Other
Wylie, R. C. Notes on the Validity of the Grove Modification of
    the Kent-Shakow Industrial Formboard Series
Wylie, R. C., Wilson, A. W. and Grove, W. R. High School Norms
    for the Grove Modification of the Kent-Shakow Formboard


Book Reviews 


Adkins, et al. Construction and Analysis of Achievement Tests:
    [...]
Blankenship's How to Conduct Consumer and Opinion Research.
    The Sampling Survey in Operation: Arthur W. Kornhauser
Bonnardell’s L'adaptation de L'homme à Son Métier: Josef Brozek
Brodman's Men at Work: William A. McClelland
Braun's Fair Thought and Speech: Robert N. McMurry
Cantor's Dynamics of Learning: Ernest R. Hilgard
Churchman, Ackoff, and Murray's Measurement of Consumer
    Interest: [...]
Civil Service Assembly's Placement and Probation in the Public
    Service: [...]
Crawford and Burnham's Forecasting College Achievement. A
    Survey of Aptitude Tests for Higher Education, Part I. General
    Considerations in the Measurement of Academic Promise:
    Harold D. [...] .... 207
Davis's Item-Analysis Data, Their Computation, Interpretation,
    and Use in Test Construction: Charles I. Mosier
DeGruchy's Creative Old Age and Lawton's Aging Successfully:
    [...]
Franziska's Die Psychologie der Menschenbehandlung im Betriebe:
    Michael Joelson


Gordon, Densford, and Williamson's Counseling in Schools of
    Nursing: Helen Nahm .... 326
Gray, et al. Psychology in Human Affairs: Harold E. Burtt .... 95
Gregg, et al. The Place of Psychology in an Ideal University:
    Walter V. Bingham .... 321
Jucius, Maynard and Shartle's Job Analysis for Retail Stores:
    William A. McClelland .... 679
Kelly's New Methods in Applied Psychology: Josef Brozek .... 675
Lazarsfeld and Field's The People Look at Radio: Alfred C. Welch .... 210
Leeper's Psychology of Personality: Horace B. English .... 571
Lytle's Job Evaluation Methods: C. H. Lawshe, Jr. .... 209
Maier's Psychology in Industry: A Psychological Approach to
    Industrial Problems: Kenneth A. Millard .... 94
Maslow's The Analysis and Control of Human Experiences: Julian
    [...] .... 100
Missiuro’s Znuzenie, O Fizjologicznych Podstawach Racjonalizacji
    Pracy: Josef Brozek .... 571
Moncrieff’s The Chemical Senses: Josef Brozek .... 432
Moore, Kennedy, and Castore's The Work, Training and Status of
    Supervisors as Reported by Supervisors in Industry: Henry L.
    [...] .... 680
Mursell's Psychological Testing: W. Grant Dahlstrom .... 679
Robinson’s Effective Study: C. d'A. Gerken .... 97
Ryan's Work and Effort: The Psychology of Production: Albert S.
    [...] .... 328
Shartle’s Vocational Counseling and Placement in the Community
    in Relation to Labor Mobility, Tenure, and Other Factors:
    Arthur H. Brayfield .... 568
Society for the Advancement of Management's Selection of Sales
    Personnel and Aptitude Testing: Robert M. Thomson .... 210
Tiffin’s Industrial Psychology: Harold F. Rothe .... 565
Tyler’s The Psychology of Human Differences: Lewis M. Terman .... 216
Westover's Controlled Eye Movements Versus Practice Exercises in
    Reading: Miles A. Tinker .... 214
Zeisel's Say It With Figures: Kenneth E. Clark .... 326
Miscellaneous 
New Books, Monographs, and Pamphlets. 102, 218, 332, 433, 573, 682 
NOWE ла репе ехо a) sao +. a's а LIA Te e o 92 


Journal of Applied Psychology 




Vol. 32, No. 1 February, 1948


Consistency of Voting by Our Congressmen 


Dean R. Brimhall and Arthur S. Otis 
Washington, D. C. 


Can we predict from the past record of a congressman how he will 
vote in the future? The answer is yes, to a very marked degree.

This answer is based on a careful study of the votes cast by 512 con-
gressmen on important propositions during the past five years. 

Let us say that a congressman's position on a scale from liberal to con-
servative is represented by assigning him a scale value from 1 to 7, the
value 1 representing the most liberal position and 7 the most conservative
position, as shown in Figure 1. Then we may say, briefly, that if each
congressman is given a scale value each year to represent his position on
the scale for that year, there are 46 chances in 100 that his scale value
will not change from one year to the next; there are 83 chances in 100
that his scale value will not change more than 1 unit from one year to the
next; and there are 95 chances in 100 that his scale value will not change
more than 2 units. The maximum change possible would be 6 units.

Liberal   1   2   3   4   5   6   7   Conservative

Fig. 1. Showing seven positions along a scale
from liberal to conservative.

Source of the Data 
The data upon which the study was based were reported in the New 
Republic. In various past issues the vote of each congressman is given 
for each of a number of important propositions on which a roll call was 
had. 
In the case of each proposition the vote of each congressman was 
recorded as follows: 
+ = a "progressive" vote, according to the opinion of the
    editors of the New Republic;
− = an “anti-progressive” vote;
0 = absence or “paired.”




For example, the votes of the Arkansas senators on the 10 propositions
reported on August 4, 1947, were recorded as shown in Table 1. 


Table 1
Manner of Reporting Votes

Proposition   1   2   3   4   5   6   7   8   9   10
Arkansas
  Fulbright   +   −   +   +   +   0   [...]
  McClellan   [...]

The table shows that in the opinion of the editors, Senator Fulbright 
voted “progressively” on Proposition 1, voted “anti-progressively” on 
Proposition 2, voted "progressively" on Propositions 3, 4, and 5, was 
absent or paired on Proposition 6, etc. 


For convenience the data were grouped and designated as shown in 
Table 2. 


Table 2
Source of Data

                             Number of Propositions
Period   Dates of Reports    Senate    House    Designation Adopted
I        May 8, 1944           18        18     “1944”
II       Feb. 4, 1945           5         6     “1945”
         Jan. 11, 1946         10        10
III      Sept. 23, 1946        14        15     “1946”
IV       Aug. 4, 1947          10        10     “1947”

Total                          57        59


Since the propositions reported upon were not the same for the House 
and Senate, a separate study was made for each legislative body. The 
study of the voting of senators is described first. 


“Progressiveness” Expressed as a Per Cent


First, a percentage rating was obtained for each senator for each
period by finding the number of votes cast by a senator during that
period and then finding what percentage of these were "progressive"
votes.

For example, if, during Period I, a senator voted on 15 of the 18
propositions and voted “progressively” on 12 of these 15, his percentage
rating for that period was 12/15, or 80%.
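
The rule just stated can be illustrated with a minimal Python sketch. The function name and the list-of-symbols input are assumptions made for illustration and are not part of the original study; votes are coded "+", "-", and "0" as in the tabulation above, and absences or pairs are left out of the base.

```python
def percentage_rating(votes):
    """Return the per cent of votes actually cast that were "progressive".

    `votes` is a sequence of "+", "-", or "0" symbols, one per proposition.
    A "0" (absent or paired) is not counted as a vote cast.
    """
    cast = [v for v in votes if v in ("+", "-")]
    if not cast:
        return None  # no votes cast in the period, so no rating
    return 100.0 * cast.count("+") / len(cast)


# The worked example from the text: 18 propositions, 15 votes cast,
# 12 of them "progressive".
period_one = ["+"] * 12 + ["-"] * 3 + ["0"] * 3
print(percentage_rating(period_one))  # 80.0
```

Returning None for a senator who cast no votes is a choice of this sketch; the article does not say how such a case was treated.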




Assigned scale values. The percentage rating of each senator for 
each of the four periods was then converted into a scale value according 
to Table 3. 


Table 3
Scale Values

Highest seventh of the percentages ........ 1
Next highest seventh percentages .......... 2
Next highest seventh percentages .......... 3
Next highest seventh percentages .......... 4
Next highest seventh percentages .......... 5
Next highest seventh percentages .......... 6
Lowest seventh of the percentages ......... 7


To assign these scale values it was necessary to find the distribution
of percentage ratings and so arrange them in order. 

For purposes of assigning these scale values, use was made of the per- 
centage ratings of only those senators who were present in all four periods. 
There were 57 such senators, as may be determined from Table 4. 

The highest seventh of the 57 percentage ratings in Period I were,
respectively, 100%, 100%, 94%, 94%, 94%, 94%, 93%, and 92%. These
were therefore assigned the scale value 1. The percentages constituting
the next highest seventh were 85%, 81%, 76%, 76%, 76%, 75%, 71%,
and 63%. These were assigned the scale value 2. Those constituting
the next seventh were assigned the scale value 3, and so on.
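
The conversion from percentage ratings to scale values might be sketched in Python as follows. The article does not say how 57 ratings were divided into seven groups or how tied ratings at a boundary were handled, so the rounded cut points used here are an assumption; they happen to give eight ratings in each of the first two groups, as in the example above. The function name and the dictionary input are likewise hypothetical.

```python
def assign_scale_values(ratings, groups=7):
    """Map each name's percentage rating to a scale value 1..groups.

    `ratings` maps a name to a percentage rating. The highest ratings
    receive scale value 1 and the lowest receive scale value `groups`.
    """
    # Order names from highest (most "progressive") to lowest rating.
    ordered = sorted(ratings, key=ratings.get, reverse=True)
    n = len(ordered)
    # Assumed cut points: the rank at which each successive seventh ends,
    # rounded to a whole number of ratings.
    cuts = [round(k * n / groups) for k in range(1, groups + 1)]
    scale = {}
    group = 0
    for rank, name in enumerate(ordered):
        while rank >= cuts[group]:
            group += 1
        scale[name] = group + 1
    return scale


# 57 made-up ratings, one per senator present in all four periods.
demo = {f"senator_{i}": 100 - i for i in range(57)}
values = assign_scale_values(demo)
print(values["senator_0"], values["senator_7"], values["senator_8"])  # 1 1 2
```

With 57 ratings this rule places nine ratings in the middle group and eight in every other group; a different tie or remainder rule would shift a few senators by one scale unit.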

The scale values for the four periods for the various senators are shown 
in Table 4. This table contains the names of only those senators who 
were present during at least two of the four periods. The starred sena- 
tors are Republicans. 

In order to present a single comprehensive scale value for each such 
senator, Table 4 shows also the average scale value for each senator who 
was present in Period IV. 


Consistency in Voting 

A glance at Table 4 will show that there is a marked degree of con- 
sistency in the scale values. Note for example how Hayden of Arizona 
got 2, 2, 2, 2, for the four periods. 

To obtain a measure of consistency of voting, each scale value was 
compared with the one preceding, to see whether it was the same or 
whether it differed by one unit or two units or three units. 

Thus, Hill of Alabama shifted first from 3 to 2 (one unit of change),
then repeated the 2 (zero units of change) and then shifted from 2 to 1
(one unit). In this way the 218 shifts were noted. They are shown in
Table 5.
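
A short Python sketch of this comparison is given here. The function name is hypothetical, and the treatment of a senator who missed a period (marked None, with only adjacent periods compared) is an assumption, since the article does not describe that case.

```python
def unit_shifts(scale_values):
    """Return the absolute change between each pair of consecutive periods.

    `scale_values` lists one scale value per period, in order; None marks
    a period in which the senator was not present. Only adjacent periods
    that both have a value are compared (an assumption).
    """
    shifts = []
    for earlier, later in zip(scale_values, scale_values[1:]):
        if earlier is not None and later is not None:
            shifts.append(abs(later - earlier))
    return shifts


print(unit_shifts([3, 2, 2, 1]))  # Hill of Alabama: [1, 0, 1]
print(unit_shifts([2, 2, 2, 2]))  # Hayden of Arizona: [0, 0, 0]
```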




Table 4 
Scale Values of Senators 
I    II   III  IV        I    II   III  IV
1944 1945 1946 1947 Avg. 1944 1945 1946 1947 Avg. 
Alabama Kansas 
Hill 3 2 2 1 2 *Capper 4 4 6 6 6 
6 6 
MEOS *Reed [Ie x 
Hayden 27:27 а 20979 Kentucky 
MéPsHand- 2" 3 § $3" 3 Barkley 1 pug Y 1 
Arkansas Louisiana 
Fulbright a mp Ellender Ped M NM е 
McClellan 0, 14.4, У. а | Overton ПОЛА ur, 4763 
California Maine 
Downey 1 1 1 2 1 *Brewster A" 6. «5 7 6 
*Knowland guru 4 *White би psy 
Colorado Maryland 
Johnson arm 3 38 Radcliffe Ze Мт У 
*Millikin 77 7 5 6 'Tydings 674 5 3 4 
Connecticut Massachusetts 
MeMahon pe qug *Saltonstall 8 ПИЋЕ 
4 2 
Delaware ME 3 
*Buck (УМК. ТУ dan Michigan 
Tunnell as | *Ferguson M Leva, 4 "5 
- х 4 
Florida Vandenberg 3 5 4 5 
Andrews 9. 5 5 Minnesota 
Pepper 1 1 1 1 1 *Ball [aS QUERY 4 4 
Georgia *Shipstead Dx RM 
George NUM 5; XM Mississippi 
Russell 2 4 3 3 Bilbo 4 4 5 7 5 
Idaho Eastland ра 5. oS 
Taylor 1 ripa Missouri 
Illinois *Donnell Ms ОК 1 
*Brooks 677 Bue FM We Montana 
Lucas 25." 3.“ СИА | Murray 1 1 1 1 1 
Indi Wheeler 5 3 
*Capehart Rohs б Nebraska 
*Willis 7 26S *Butler eee V's 6 
uda *Wherry T2 5. ^4 6 6 
*Hickenlooper а“. Ваке Меуада 
*Wilson $614 8 McCarran 3 4 8 4 4 


Table 4—Continued 


I    II   III  IV        I    II   III  IV
1944 1945 1946 1947 Avg. 1944 1945 1946 1947 Avg. 
New Hampshire South Carolina 
*Bridges pe 6. 57 6 6 Johnston 4 2 3 
*Tobey € 4 S 3 3 Maybank 3 $. AP 3 
New Jersey South Dakota 
*Hawkes eigo rt *Bushfield и ga fo 7": 
*Smith $345: д *Gurney атте ТИБИ 
New Mexico "Tennessee 
Chavez Er... | 1 2 2 McKellar 5 феб 4 
Hatch 2 2 2 3 2 Stewart 3 4 4 5 4 
New York Texas 
Mead 1 i.d Connally DES WC Y О 
Wagner 1 1 1 1 H O'Daniel 6 TT, 0619 6 
North Carolina Utah 
Bailey 5 0 5 Murdock 1 Lite 
Hoey Моја а 5 Тһошаз 17 ОТИ 1 
North Dakota Vermont 
*Langer ча тё 23 *Aiken 8 2.9 08408 
*Young 6 5 5 *Austin 3 5i x 
Ohio Virginia 
*Taft | RES Є S vg 5 Byrd 6 бо "А 5 5 
Oklahoma Washington 
*Moore Tum 7 ава Magnuson Дог S T^ 
Thomas & eee REY ТҮҮ. Mitchell LoOX 
Oregon West Virginia 
*Cordon ВУ р Kilgore 1 1 1 1 1 
*Morse а oR aba *Revercomb 7 6 4 6 6 
Pennsylvania Wisconsin 
Guffey РО ДУ“ 1 LaFollette Б. 
Муегв 1. (aaa *Wiley ©, IDE LOS 
Rhode Island Wyoming 
Gerry 6 5 6 O'Mahoney 2 3 3 2 2 
Green io 6o ix^ *Robertson v A DET ve e i 




Table 5
Shifts in Scale Values of Senators from One Period to the Next

Amount of Shift    No. of Cases    Per Cent    Sub-total
0 units                 90            41           41
1 unit                  91            42           83
2 units                 34            16           99
3 units                  3             1          100
Total                  218           100

Table 5 is read as follows. There were 90 instances in which the shift
was zero units; that is, in which the scale value remained the same as it
was the year before. These 90 cases were 41% of the 218 shifts. There
were 91 instances (42% of the 218) in which the shift in scale value was
only one unit. The 41% and 42% together made 83% of cases in which
the shift was not over one unit. There were 34 instances (16% of the
218) in which the shift was two units, making 99% of cases in which the
shift was not over 2 units. In only 3 cases (about 1%) were the shifts
as much as 3 units, as from 4 to 1.
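
The tabulation in Table 5 can be reproduced with a few lines of Python. This is a sketch under one assumption: the sub-totals are taken to be running sums of the rounded per cents, which matches the printed figures. The function name is hypothetical.

```python
from collections import Counter


def shift_table(shifts):
    """Tabulate shifts as (units, cases, per cent, sub-total) rows.

    Per cents are rounded to whole numbers; the sub-total is the running
    sum of those rounded per cents (an assumption about the table).
    """
    counts = Counter(shifts)
    total = sum(counts.values())
    rows = []
    running = 0
    for units in sorted(counts):
        per_cent = round(100 * counts[units] / total)
        running += per_cent
        rows.append((units, counts[units], per_cent, running))
    return rows


# The 218 senatorial shifts reported in Table 5.
senate_shifts = [0] * 90 + [1] * 91 + [2] * 34 + [3] * 3
for row in shift_table(senate_shifts):
    print(row)
```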

The following are the three senators who shifted 3 units: Overton of 
Louisiana, 6 to 3; Taft of Ohio, 4 to 7; and Cordon of Oregon, 4 to 7. 

The following senators maintained the same scale value throughout 
the four periods:

Hayden, Arizona,
Pepper, Florida,
Lucas, Illinois,
Reed, Kansas,
Murray, Montana,
Wagner, New York,
Green, Rhode Island,
Kilgore, W. Va.

The Representatives 


A comparable study was made of the consistency of voting of the
representatives.

The scale values of those representatives who were present during at
least two periods are shown in Table 8, near the end of the article, to-
gether with an average scale value for those who were present in the last
period. The starred representatives are Republicans.

The degree of consistency of voting by representatives is shown in
Table 6, which is comparable to Table 5.

Table 6 is read as follows: Of the 1076 shifts made by the repre-
sentatives (in most cases 3 shifts by each), 504, or 47%, were zero units
(no shift at all); 391, or 36%, were shifts of one unit, making 83% of cases in
which the shift was not over one unit; 127, or 12%, were shifts of two
units, etc.


Table 6
Shifts of Scale Values of Representatives from One Period to the Next

Amount of Shift    No. of Cases    Per Cent    Sub-total
0 units                504           47           47
1 unit                 391           36           83
2 units                127           12           95
3 units                 46            4.2         99.2
4 units                  7            0.7         99.9
5 units                  1            0.1        100.0
Total                 1076          100.0


The representative who made a shift of 5 units was Winstead of Mis- 
sissippi, who shifted from 7 in “1946” to 2 in “1947.” 

Table 7 shows the data of Tables 5 and 6 combined; that is, for con- 
gressmen of both houses together. 


Table 7
Shifts in Scale Values of Senators and Representatives Together

Amount of Shift    No. of Cases    Per Cent    Sub-total
0 units                594           46           46
1 unit                 482           37           83
2 units                161           12           95
3 units                 49            4           99
4 units                  7            0.5
5 units                  1            0.1


Table 7 shows that, as stated at the outset, there are 46 chances in 
100 that a congressman's scale value will not change at all from any given 
year to the next; there are 83 chances in 100 that his scale value will not 
change more than one unit; and there are 95 chances in 100 that his scale 
value will not change more than 2 units. 

The chances are less than 1 in 100 that a congressman's scale value 
will change more than three units out of a possible 6 units of change; and 
there is less than 1 chance in 1000 that a congressman's scale value will
change 5 units. 
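
As a check on the combined figures, the counts of Tables 5 and 6 can be added and the chances read off directly. The following Python sketch uses only the case counts printed in those tables; the helper name share_within is hypothetical.

```python
from collections import Counter

# Case counts by amount of shift, copied from Tables 5 and 6.
senate = Counter({0: 90, 1: 91, 2: 34, 3: 3})
house = Counter({0: 504, 1: 391, 2: 127, 3: 46, 4: 7, 5: 1})

combined = senate + house          # adds the counts for each amount of shift
total = sum(combined.values())


def share_within(counts, max_units):
    """Fraction of shifts that were no larger than `max_units`."""
    hits = sum(n for units, n in counts.items() if units <= max_units)
    return hits / sum(counts.values())


print(total)                                   # 1294
print(round(100 * share_within(combined, 0)))  # 46
print(round(100 * share_within(combined, 1)))  # 83
print(1 - share_within(combined, 3) < 1 / 100)   # True: over 3 units
print(combined[5] / total < 1 / 1000)            # True: 5 units
```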


Reversing the Scale Values 


It goes without saying, no doubt, that the consistency of congressmen 
is the same whether we rate the most liberal congressman 1 and the most 
conservative 7, or whether we rate the most conservative 1 and the most 
liberal 7. A congressman whose scale value changed from 1 to 2 by the 


Table 8 
Scale Values of Representatives 
I    II   III  IV        I    II   III  IV
1944 1945 1946 1947 Avg. 1944 1945 1946 1947 Avg. 
Alabama Sheppard Bu 3^ 2. 
Andrews 5 | aut MORE" Tolan 2 2 1 
Boykin 4 6 7 6 6 Voorhis 1 1 2 
Grant £- B4 EE *Welch О 2^. 2 а 
Hobbs 4 (X CB Бета 
Jarman 3 8 8 3 3 |Colrado 
Manasco 4d E Vl vv *Chenoweth үз 6! 74 28 
Patrick 1 1 *Gillespie 5 4 
Rains РАДЊИ 2 US "ни ВИА УБ Ба 
Sparkman 8 2 2 *Rockwell 85305. 6. а. 
Arizona Connecticut 
Harless 1 2 2 2 23 Geelan 1 1 
Murdock т | Тус? t Kopplemann 1 1 
е 2 
дез лий г 1 1 
Cravens 6 65 7 4 4 
Р *Talbot 6 6 5 
Gathings Ба em 6 6 Даб е 1 1 
Harris 3 2 3 2 2 
Hays rans een рик 
Mills 3 3 4 3 3 Traynor 2 2 
Norrell 6 5 ae 85 5 s 
Trimble 3 2. Br 2 Florida 
California Cannon 3 6 3 
ер а "E HW EAM Hendricks 3 4 5 3 4 
Peterson 2c 4 8 34 
Douglas 1 1 с ные | А 4 
Doyle 1 1 Price 3 4 5 3 1 
Elliot (Pocos das "o d 
Engle 2 3 2 £m 
*Gearhart РААН рег S Georgia 
SA : : 1 1 Brown se «38 78 Б] : 
y Cam А 8. 8 
“Hinshaw 6 4 3 6 5 бох. 4 aay ee Vl 
M ў AME. 1 Gibson 5 6 6 i 
730 Расе а ЖК М. 
“Johnson, J.L. 4 4 3 4 4 Peterson 4 4 4 
M ч : i ч А Тагуег Е ^s А 
Vi 4 2 3 3 
*McDonough $ 3 C46 Wd УНА МЕ 
Miller 1 1 1 1 
Outland 1 1 1 Idaho 
Patterson 1 1 *Dworshak Twv 5 
*Phillips 5 7 6 6 6 White 2 38 3 




Table 8—Continued 


I    II   III  IV        I    II   III  IV
1944 1945 1946 1947 Avg. 1944 1945 1946 1947 Avg.


Illinois *Jensen rd a REA, Ја и i 
*Allen uu T2 5 6 6 *LeCompte 5 oo), 9 6 
*Arends Br e dT 7 7 *Martin 6 T. .9. · 6 
*Bishop РБЕ УО? ху *Talle б” 2 01 2:159 
*Chipfield oT поре о 
*Church & T T 4 6 | Каа 

Dawson ЈУ“ gun сше 6 Н : PUT 
*Dirksen б dri ТАИ SOS. 6: V 
1 1 
саће ТЕЛ. ЧИЯ Rees 6. 6. ба TD 
Gorski ЕИ А ы н, 
-  *Howell ds (6 us | Winter ГЕ Е: 
*Johnson 7 7 7 6 7 | Kentucky 
Kelly 2 2 Bates 2g'1'3 Tim 
Link 1 1 Chapman 8 8 3 8 8 
*McMillen B. rt Chelf УАЙ Те 
*Mason КЛА ЖАМ Clements ду за D 
O'Brien 2 1 H 2 2 Gi 3 75765 2 
| Price 1 1 1 1 M 24 : TS. 
*Reed To T NB 7 7 O'Neal » aho ees 
Resa Ej *Robsion ГО. Ц 
Rowan ОДЕ Spence 1 ДАГАА 
Sabath 1 1 1 1 1 
*Simpson 6 6 5 4 5 | Louisiana 
*Sumner TUS T Allen Sr. 51 VS р 
*Vursell CREE Орао Brooks 8785 T RUE 
Doméneaux 4 5 7 5 5 
Indiana Hebert ГИБ ЗР ы 
| *Gillie 5. BUE Larcade Ир 
*Grant Де 83 30-8 McKenzie Bo» 
*Halleck 7 7 7 7 7 Мајопеу 2 3 8 
"Harness d Tid ; у 7 Morrison Bi Sh) X2 
*Johnson,N. 5 6 7 6 6 
*LaFollette 4-3. | Maine 
*Landis B. 46 (ба *Fellows (MD PR RO EDD 
Ludlow 24.79 1 *Hale 6 5 5 4 5 
Madden 1 1 1 1 1 *Smith 5 127 +" aX 
"Springer 6. 6 . 6: '& 
*Wilson Wear E ues | Mera 
win 5" DA | 

Iowa *Beall 65. буг биле 
‘Cunningham 4 6 6 6 6 D'Aleando 1 2 2 2 2 
*Dolliver 65 бана Fallon Si 2.5808 
*Gwynne B8. 75 Хало боб Roe 4 1 
*Hoeven а. FN IN РТ Sasscer 2.:8, 2.8088 




Table 8—Continued 


I    II   III  IV        I    II   III  IV
1944 1945 1946 1947 Avg. 1944 1945 1946 1947 Avg. 
Massachusetts Mississippi 
*Bates 5 5 4 6 5 Abernethy 3 5 6 Б] 4 
*Clason 7 5 3 4 5 Colmer 5 4 4 3 4 
Curley 2 2 1 McGehee hte Pete’ fi 
*Gifford 7T i 6 7 7 Rankin 3 AN. dut 3 5 
*Goodwin Tw 6 + 6 Whitten SK op ми 4 5 
*Herter 4 4 Б] 4 4 Whittington 3 3 4 2 3 
*Heselton 4 Б] 4 4 Winstead 8 6 7 2 4 
*Holmes 5 5 5 3 у 
Тапе 1 1 2 2 2 | Misouri 
McCormack 1 1 1 1 "Arnold бф TP ET ST 
*Martin 6 6 6 Bell 5 4 4 4 4 
Philbin DAE ла rok: "Bennett Sue o day 
*Rogers $ TS A 8.26 Cannon sod l1. 27ка 
*Wigglesworth 6 5 4 6 5 Carnahan 11 
Cochran 1 1 1 
Michigan *Cole uo NE ee 
*Blackney C аи er n Ws *Ploeser T2076. 7-17 
*Bradley TU + OLN ау 6 *Schwabe Жы Жн, ЧЕГҮҮ, 
*Crawford 7 в 6 6 6 *Short А ов 
Dingell Io. dug med Slaughter 4 4 5 
*Dondero euer ОЕБС Sullivan la 1 
*Engel Sedo TES Zimmerman 2 3 2 83 3 
*Hoffman вето Коси 6 
Hook 1 1 Montana 
Jonkman LANDE 6 7 *D'Ewart 6 5 5 5 
Lesinski i-3j 1 1 1 Mansfield A mur FS 
*Michener 6 
O'Brien 2 1 1 j Nebraska 
Habbo. s И *Buffett Rb O АУТ 
+ 1 6 
БАЗАМА I dq 013, И | ues $1.44 E БЕ 
*Shafer 7 6 6 5 6 Miller 5 7 7 5 6 
*Wolcott ‘RA VR 19r 8 "Stefan r US cu o Є6 
*Woodruff LAM Re A ПВО Nevada 
Minnesota Bunker T. 3 
*Andersen LN D TE! : 
*Andeen 6 7 7 5 6 wn Sombie 
dams 5 5 
Gallagher Oe *Merrow a орла фи 5 
*Hagen Ф (ье! 
*Judd 4 8 3.8048 New Jersey 
*Knutson T 7T “Ман ee *Auchincloss 6 7 3 6 5 
*O’Hara #07. Te тих *Canfield 8:278 "Sow 3 
*Pittenger 4 6 6 *Case 4 3 4 4 
Starkey Mox *Eaton б 8 6 mei 6 




Table 8—Continued 


I    II   III  IV        I    II   III  IV
1944 1945 1946 1947 Avg. 1944 1945 1946 1947 Avg. 
*Hand 4 8 5 + Quinn 1 1 
Hart 1 1 1 2 1 Rabin 1 1 1 1 
*Hartley TW T. 7 5 Rayfiel 1 1 1 1 
*Кеап 5 4 4 4 4 *Reed 7 "T Jo zj 
Norton M 6o 1 1 1 Roe 2 2 
‘Sundstrom 6 5 4 6 5 Rogers D» 
*Thomas Y. T vat 5 6 Rooney 1 | од 1 
*Towe 4$. 4. 0478.5 *Sharp 6 4 
*Wolverton 3 3 3 5 4 Somers 1 1:23" 9579 
Now Maxton *Taber 7 Г «x 7 
БТИ ЗИ иН M 
New York "Wadsworth 7 7 7 5 6 
*Andrews 6-5 4 5 Š 
Balaman] 2 3 3 ea 2^ & TA a 
Es zy Ai. Bonner 2) 3, 8 ома 
T E Bulwinkle 2 2 8 4 8 
Bloom 1-: 2d 1 1 Clark Pine war ae 
*Buck 7 5 4 5 Cooley 2 3 3 2 2 
ч оя 2 ла | pus 3.8 4' 378 
ies Non. ИШИМ Minen о тв СӘ 0 a 
Byrne 1. ( 1 1 1 Ervi d Te 
Celler 2 1 1 1 1 Folger 1 2 2 1 1 
"Соје Sa 7) Gets “fe 3/718 a ME 
Delaney, Jae. 1 : Weaver 2 ^9 “а 
Delaney, John 1 "MB sers 1 
*Elsaesser 6 4 5 5 | North Dakota 
*Fuller T 04 ОТТ, *Lemke @ .8 43978 
*Gamble И, aA Eb *Robertson 6 ВО 
*Gwinn 6 5 6 6 
Hah EA 0 5, 4 06 35 | Ohio 
*HallL.W. 6 6 4 6 6 *Bender 4 3 3 6 4 
*Hancock 077 "OF 6 *Bolton С ЗАК А ЕОМ 
Heffernan Lob. ЕГУ *Brehm 7T 0, 5. TP 
*Kearney [COE NE wu о *Brown T. o У а, 
Keogh Ж, Mb *Clevenger ТАТЕ ЂУЛА 
*Kilburn оф x бели Стовзег 1 T0 Lo 1 
*Latham БС КҮ *Elston 6 8) OTT 
*LeFevre Б 8, "6 T. 9 Feighan 1 1 1 1 1 
Lynch bolum Gardner 32.31 
Marcantonio 1 1 1 1 1 *Griffiths 7 5:55.68 6 
O'Toole 1; 4 Teese *Hess 7 T фу 
Pfeifer 2 1 2 1 1 Huber 1 1 1 1 
Powell 121-315 44 *1 *Jenkins T8 Е 




Table 8—Continued 


I    II   III  IV        I    II   III  IV
1944 1945 1946 1947 Avg. 1944 1945 1946 1947 Avg.
*Jones У 7 5 6 *Kinzer T2 UE - 6 
Kirwan УК | 1 1 1 "Кипке! 5335 558 
*Lewis 56 5 4 ЖА *MeConell 7 5 4 7 6 
*McCowen ^. 6 б ENS MeGlinchey 1 1 
*MeGregor 6.9 D» (M6 Morgan $^ 1 d 1 
*Ramey + & 18 eyed Murphy 1 1 
*Smith v RI d 7 + 6 *Rich TOW cT 6 7 
Thom 2 1 *Rodgers 0018 6 
"Уогув p $255 6 6 Sheridan 2 2 H 
*Weichel DE VENE Ат *Simpson 77 6 аи 
*Tibbott 6 6 5 TEM 
er E Walter @ s 2 "m 
* 
Johnson да Мате 22 8 b pee tp M 
Monroney 2. 2 32024752 Rhode Island 
*Rizley #1257 Z^ 26457 Fogarty L2 1 1 
*Schwabe a ae Gen? Forand 1 1 1 1 1 
Stewart 4 
Stigler 5 Í 2 | South Carolina 
Wickersham 3 3 3 Bryson 8$. 3- 8 2 
Hare 3 4 4 
Oregon MeMillan 9 4 4 2728 
*Angell di By ug Richards 2 3 .9 sie 
*Elswrth 5 6 65 6 6 Riley 8 з 383 ae 
*Stockman ба 6 «Mug Rivers 5 0 4 3 
Pennsylvania South Dakota 
Barret 124131; *Case 4 6 4 4 M 
Bradley 1l 1 *Mundt b. T 7 "ME 
"Brumbauggh 7 6 5 
"Campbell EI "Tennessee 
А Соорег 2 3 3 ан 
Corbett 4 4 4 
Courtney 2 3: з 93" 
ЖОЛА МЕ БОТ d.d : 
5 Davis 27:29 2 872 
Fenton ^ 85 боё -5 Ear 
Flood CMS thman 4 6 
pe Gore 235 3 2998 
Fulton 2v ГҮ ee *Jenni 
Ма Jennings 43.6.06. TEN 
Gavin 8 97 S Re 
*Gerlach 9 LE 7 5 Kefauver 1 1 1 1 1 
Ole n KR NOMEN гу меаи, BO е TEM 
Беи ED eben Priest 1 9 2 8" 
Granahan fd “Reece TNT 
Green 1] Texas 
*Gross Per. hast Bekwoth 3 3 3 3 8 
Hoch Pee peo Combs 2 2 на 
Kelley 1 1 1 1 1 Fisher 5 4 5 4 4 


* 
Consistency of Voting by Our Congressmen 13 


Table 8—Continued 


A —— M = 


I E nu Iv E E 3H IV 
1944 1945 1946 1047 Avg. 1944 1945 1946 1947 Avg. 
Gossett i Fo А 4 Robertson 4 4 4 
Johnson,L.A.3 3 3 Smith 44^ 8784 8 
yep тоз 39$ 3. usi eon 
ilday В 4 0344245 ee рх EST 
Lanham 5 5 5 си 
Lyle 43$. Dolacy 17: 
y *Holm 5. ee ЕЈ 
Mahon EOBR Et ve 
Mansfield *Horan 6:72: "5 "ILS 
CELA II ME. Jackso 1o ЛТ 
Patman ür 4-5 45055. «33 TN 
Pickett Брена | эте куз 
Роаде 5 3 4 2 8 | West Virginia 
Russell t жор AA Bailey а а 
Sumners £35 4 *Ellis r m. ТАНИН 
Тһошаз з 3 43 а“ а Hedrick 1 2 "73 
Thomason S08 а ка Кее 1 1". УЙ 1 
West nugis v6 Neely 1493 
Worley тка, а“ !À Randolph $243 
Utah Wisconsin 
Granger Petr me 1 1 Biemiller 1 1 
Robinson 1 1 1 *Вугпез 6 6 6 6 
*Henry 5. 4 
Vermont Hull „АР. Т Жы у) 
*Plumley А о бб *Keefe в б би" 
Virginis *Murray 4 АЕ ar A 5 
»(у" 
Bland Se he dee or ОА a 
d 3 А : Simenon 8 4° 4 7 5 
Deis MEC RES Wasielewski 3 2 2 
Flannagan 0187113502 Wyoming 
Gary 2 "3. 28 «8 Barrett 6. 7. a 


first method would merely change from 7 to 6 by the second method. 
The degree of consistency is the same in both cases. 


Further Research Suggested 


One of the main reasons for making and reporting this study is to 
stimulate further research along this line. Psychologists have investi- 
gated the consistency of behavior of clerks, salesmen, aircraft pilots, 
mechanics, taxi-cab drivers, and a host of other occupations. Yet so far 
as we know this is the first time a study of consistency has been made of 
the most important employees of democracy. 




Further interesting researches would be to compare Republicans and 
Democrats either in their degrees of liberalism-conservatism or in their 
consistency of voting; to compare Southern Democrats with Northern 
Democrats; to investigate the consistency of voting in State legislatures, 
city councils, etc., and to compare scale values in liberalism-conservatism 
and other dimensions obtained in Congress with those previously had in 
State legislatures, etc. 

Received October 8, 1947. 
Early publication. 


Job Evaluation of Nonacademic Work at the 
University of Illinois * 


Alice May Jones 
University of Illinois 


Job evaluation is a systematic procedure which attempts to appraise
the worth of a job, relatively considered with respect to other jobs in an 
organization, in direct proportion to its respective monetary value. J. 
E. Walters says, “It is the determination of the value of the job and not 
the worth of the employee doing that job" (6). 

The history of job evaluation is replete with literature and requires 
no elaboration. The various aspects of this field, particularly the point
system, are reviewed by J. E. Walters (6). Time-study methods are pre- 
sented by Brash (3). The number of factors involved in a plan for evalu- 
ation of jobs is treated by Bradbury (2) and Viteles (5), and the relative 
weight each factor should bear is considered by Moore (4). 

The review of literature on this subject indicates it is common practice 
for supervisors and specialists to evaluate various jobs in an organization, 
while it reveals employee-participation in the job evaluation plans of 
only two organizations; namely, the General Foods Corporation (7), and 
a midwestern manufacturing plant of paper specialties reported by 
Benge (1). 

The purpose of this study is to describe the results of a plan of job 
evaluation at the University of Illinois which considered employee- 
participation, as well as the participation of supervisors and the personnel 
department. 


Procedure 


In the job evaluation plan which the University of Illinois adopted,
nonacademic jobs requiring Civil Service appointment were evaluated
on a point scale. By way of explanation, the Civil Service system in
effect at the University of Illinois, which is entirely separate from the
Civil Service system of the State of Illinois, contains 300 classifications,
involving 2300 employees, 1500 of whom are included in the 93
classifications involved in this study. At the time of this study, ratings on
the other 207 classifications, involving 800 employees, were not available.
Since that time, these ratings have been submitted to the Office of
Nonacademic Personnel, and that office is now in the process of studying them.


* Submitted as Master’s Thesis, under the direction of Professor T. W. Harrell and
with the cooperation of the Non-academic Personnel Bureau, in partial fulfillment of
the requirements for the degree of Master of Arts in Psychology in the Graduate School
of the University of Illinois, 1946.

Regarding the classifications involved in this study, the Office of Non- 
academic Personnel sent a copy of “Job Evaluation Work Sheet” to every 
supervisor of and employee on all jobs under Civil Service appointment. 
Each supervisor and employee was asked to assign numerical values to 
any given job, according to the points allocated for each of four major 
factors, separated into 14 minor factors, listed on the “Description and 
Value of Factors used in Job Evaluation.” 

The four major factors, each of which had a maximum point value,
were “preparation for the job,” 30 points; “personal qualifications,” 30
points; “working conditions,” 10 points; and “responsibilities required
on the job,” 60 points. Thus, 130 was the total number of points it was
possible to assign to any one job classification.
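
The point allocation described above can be checked with a brief sketch.
The four factor names and their maxima are taken from the text; the names
MAX_POINTS and total_points and the sample rating are hypothetical and
serve only as illustration.

```python
# Maximum points per major factor, as stated in the text.
MAX_POINTS = {
    "preparation for the job": 30,
    "personal qualifications": 30,
    "working conditions": 10,
    "responsibilities required on the job": 60,
}


def total_points(rating):
    """Sum a rating after checking each factor against its stated maximum."""
    for factor, points in rating.items():
        if not 0 <= points <= MAX_POINTS[factor]:
            raise ValueError(f"{factor}: {points} is outside 0..{MAX_POINTS[factor]}")
    return sum(rating.values())


# Hypothetical rating of one job classification (illustrative values only).
sample = {
    "preparation for the job": 12,
    "personal qualifications": 15,
    "working conditions": 4,
    "responsibilities required on the job": 20,
}

print(sum(MAX_POINTS.values()))  # 130, the largest total any job can receive
print(total_points(sample))      # 51
```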

Forty-two supervisors submitted ratings on 62 jobs, ranging in numeri- 
cal value from 17 to 119 points, and employees of ten departments and 
committees submitted ratings on 64 jobs ranging in value from 13 to 103 
points. Of the 93 classifications involved, supervisors and employees 
each submitted ratings for 33 of the same classifications, the other 60 jobs 
being appraised by either supervisors or employees, but not both. In addition,
each of the 93 classifications was evaluated by the writer under the direction
of Dr. Arlyn Marks of the Office of Nonacademic Personnel; these ratings are
hereinafter termed “personnel department” ratings. The classifications
evaluated by the personnel department ranged from 16 to 90 points.
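
The counts in this paragraph are mutually consistent, as the following
arithmetic sketch shows. The figures 62, 64 and 33 come from the text;
the variable names are merely illustrative.

```python
supervisor_jobs = 62   # classifications rated by supervisors
employee_jobs = 64     # classifications rated by employees
rated_by_both = 33     # classifications rated by both groups

# Classifications rated by at least one group (inclusion-exclusion).
rated_by_either = supervisor_jobs + employee_jobs - rated_by_both

# Classifications rated by one group only.
rated_by_one_group_only = rated_by_either - rated_by_both

print(rated_by_either)          # 93
print(rated_by_one_group_only)  # 60
```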

Three classifications were rated by ten or more supervisors; namely,
“clerk-stenographer, junior,” rated by 32 supervisors, “clerk-stenographer,
senior,” by 21 supervisors, and “clerk-typist, junior,” by 10 supervisors.
Ten classifications were each rated by two employees, the
remaining 54 classifications bearing single employee ratings. In this
connection, it should be mentioned that 41 of the 64 classifications were
evaluated by a committee which represented the combined opinions of
many employees.


Results and Discussion 


The reliability of the point ratings submitted by 21 supervisors on
the classification “clerk-stenographer, senior,” was determined in the
following manner: The ratings submitted by the 21 supervisors were listed
at random, and the number of points assigned by each supervisor to each
of the 14 factors involved was tabulated. The averages for each of the
14 factors for 10 supervisors were correlated with the averages for each
of the 14 factors for the other 11 supervisors. The result was a correlation
of .89 ± .04, and when corrected for small number of cases, the
correlation became .88. (The correlations stated in this study are corrected
correlations, provided the number of cases involved is less than 50.
If the number of cases is over 50, the correlation has not been corrected.)
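
The procedure just described may be sketched as follows. The raters are
split into two groups, the factor averages of each group are computed, and
the two sets of averages are correlated. The source does not name its
small-sample correction; the formula in shrunken_r is an assumption, chosen
because it turns .89 into .88 when the number of paired averages is 14. The
toy ratings are hypothetical.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def column_means(rows):
    """Average each factor (column) over a group of raters (rows)."""
    return [sum(col) / len(col) for col in zip(*rows)]


def split_half_r(ratings, first_group_size):
    """Correlate factor averages of the first group with those of the rest."""
    first = column_means(ratings[:first_group_size])
    second = column_means(ratings[first_group_size:])
    return pearson_r(first, second)


def shrunken_r(r, n):
    """Assumed correction: r_c^2 = 1 - (1 - r^2)(n - 1)/(n - 2)."""
    return math.sqrt(max(0.0, 1 - (1 - r * r) * (n - 1) / (n - 2)))


# Hypothetical toy data: 4 raters (rows) by 3 factors (columns).
toy = [[10, 5, 2], [12, 6, 3], [11, 4, 2], [9, 6, 1]]
print(split_half_r(toy, 2))

# With the reported r = .89 and 14 paired factor averages:
print(round(shrunken_r(0.89, 14), 2))  # 0.88
```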

In the same manner as that for “clerk-stenographer, senior,” the
reliability of 32 supervisors' ratings (two groups of 16 supervisors' ratings)
for “clerk-stenographer, junior,” was computed to be .95 ± .02.

In order to ascertain the reliability of supervisors’ point ratings on
30 different job classifications evaluated by two or more supervisors,
total ratings for each job classification submitted by the various
supervisors were listed randomly. The first two total ratings on each job were
used in determining the reliability of such ratings, the first total rating
on each job being an “X” and the second total rating on the same job
being a “Y.” The X's were correlated with the Y's, and a correlation
coefficient of .50 ± .09 was computed.

The reliability of the employees' point ratings on 10 jobs, which were
appraised by two or more employees, was determined in the same manner
as that for the supervisors. A reliability coefficient of .59 ± .12 was
found.

In order to ascertain the degree of agreement among employees, 
supervisors and personnel department, intercorrelations were found for 
ratings submitted by these three groups on the classifications involved. 
The ratings submitted by supervisors were averaged into a single rating 
for each job classification, and, likewise, the ratings submitted by em- 
ployees were averaged into a single rating for each job classification. 
The ratings of the personnel department represented single ratings for 
each job. 

On the 62 classifications rated by both supervisors and personnel
department, the correlation was .86 ± .02. On the 64 classifications
rated by both the employees and personnel department, the correlation
was .85 ± .02. On the 33 classifications rated by both supervisors and
employees, the correlation of their ratings was computed to be .89 ± .02.
Considering the ratings of the personnel department on these 33 Civil
Service classifications, the correlation between employees' and personnel
department's ratings was .90 ± .02, and the correlation between
the ratings of supervisors and those of the personnel department was
.86 ± .03.
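
The figure following each correlation above is not explained in the text.
As a sketch, and on the assumption that it is the classical probable error
of a correlation coefficient, 0.6745(1 - r^2)/sqrt(N), the stated values can
be compared with computed ones. The function name and the list layout are
illustrative only.

```python
import math


def probable_error(r, n):
    """Classical probable error of a correlation r based on n pairs."""
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)


# (comparison, reported r, number of classifications, figure stated after r)
reported = [
    ("supervisors vs. personnel department", 0.86, 62, 0.02),
    ("employees vs. personnel department", 0.85, 64, 0.02),
    ("supervisors vs. employees", 0.89, 33, 0.02),
    ("employees vs. personnel department, same 33 jobs", 0.90, 33, 0.02),
    ("supervisors vs. personnel department, same 33 jobs", 0.86, 33, 0.03),
]

for label, r, n, stated in reported:
    print(f"{label}: computed {probable_error(r, n):.3f}, stated {stated:.2f}")
```

Under this assumption each computed value rounds to the stated one.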

Algebraic differences between the point value ratings submitted by
employees, supervisors and personnel department were determined to
provide additional information regarding the relation between the
evaluations of these three groups. Considering the personnel department's
ratings as a constant, the algebraic differences between the average rating
of the employees and that of the personnel department on each of the 64
classifications rated by both totalled 181.5 points, and the algebraic
differences between the average rating of supervisors and that of the
personnel department on each of the 62 classifications rated by both totalled
264.8 points. This means that the ratings of supervisors, considered as
a total, were higher than those of employees, and that there was closer
agreement between the ratings of employees and personnel department on
these classifications, than there was between supervisors and personnel
department on these classifications.
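
Because the two totals rest on different numbers of classifications (64 and
62), a per-classification average may make the comparison easier to read.
The sketch below divides the reported totals accordingly. It also shows, on
hypothetical ratings, how a signed (algebraic) sum differs from a sum of
absolute differences; the direction of subtraction (group minus personnel
department) and the absolute-difference alternative are assumptions, not the
author's stated method.

```python
def signed_and_absolute_sums(group_avgs, personnel):
    """Return (sum of signed differences, sum of absolute differences)."""
    diffs = [g - p for g, p in zip(group_avgs, personnel)]
    return sum(diffs), sum(abs(d) for d in diffs)


# Reported totals divided by the number of classifications involved.
employees_mean_diff = 181.5 / 64
supervisors_mean_diff = 264.8 / 62
print(round(employees_mean_diff, 2))    # 2.84
print(round(supervisors_mean_diff, 2))  # 4.27

# Hypothetical ratings for three jobs: signed differences can cancel.
signed, absolute = signed_and_absolute_sums([50, 40, 70], [45, 48, 67])
print(signed, absolute)  # 0 16
```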

The fact that the evaluations submitted by supervisors were higher 
than those submitted by employees for the same jobs may have been 
for one or more of three reasons: first, employees may rate their jobs 
conservatively because of modesty; second, if the supervisors assign high 
ratings to employees’ jobs, this fact promotes their prestige and impor- 
tance in their own jobs; and, third, supervisors may be anxious to seek the 
ea of their employees and so will assign high ratings to the employees’ 
jobs. 

As stated previously, it is a common practice for supervisors to rate
jobs of their employees, but employees, themselves, are seldom asked to
rate their own jobs, particularly when their ratings will be considered in
a plan of job evaluation. This feeling of participation may have caused
employees, more than supervisors, to be more conservative in their
judgments of their jobs.


Summary and Conclusions 


1. The results of this study indicate a high degree of agreement
among the point ratings of employees, supervisors and personnel
department on the classifications involved, in so far as the relative levels
of jobs are concerned.

2. When algebraic differences were taken for the actual point ratings 
on the classifications involved, the evaluations of the employees agreed 
more closely with those of the personnel department, than did those of 
the supervisors. 

3. So far as the writer has been able to ascertain, there has been no 
correlation published for the reliability between the ratings of employees 
and those of supervisors or management in similar studies. The correla- 
tions between the ratings of employees and supervisors, between em- 
ployees and personnel department, and between supervisors and per- 
sonnel department, are given in this study. 

Received April 25, 1947. 




References 


1. Benge, E. J. Job evaluation in a paper plant. Person. J., 1940, 19, 42-48. 

2. Bradbury, K. F. Job evaluation analyzed. Advanced Management, 1940, 5, 16-20. 

3. Brash, J. A. Time-study methods applied to job evaluation. J. consult. Psychol.,
1945, 9, 152-160.

4. Moore, H. Problems and methods of job evaluation. J. consult. Psychol., 1944, 8, 
90-99. 

5. Viteles, M. S. A psychologist looks at job evaluation. Personnel, 1941, 17, 165-176.

6. Walters, J. E. Personnel relations. New York: The Ronald Press Co., 1945. 

7. Job evaluation: Formal plans for determining basic pay differentials, Studies in Per- 
sonnel Policy No. 25, National Industrial Conference Board, 1940. 


Accident Proneness Among Street Car Motormen and 
Motor Coach Operators 


Clarence W. Brown and Edwin E. Ghiselli 
University of California 


The term, accident proneness, has been used in two somewhat different 
meanings. In some instances the term is used to refer to the individual 
who is involved in many accidents of the same sort. Thus the operator of
a public conveyance who has had many more accidents than the average 
of his fellow operators is termed accident prone. In this sense the term 
implies sheer quantity of accidents. In other cases accident proneness 
is used to describe the individual who has had accidents of several different 
kinds. Thus the individual who has had accidents at home, at work, 
and on the public highways is called accident prone. 

While these two uses of the term, accident prone, are not wholly 
different, the second interpretation has certain implications not contained 
in the former. The facts of individual differences in accident rate is 


between rates of accidents of different kinds. If, for example, it were 
found that persons who had most accidents at work also were those who 
or if persons having most minor accidents 
also had most major accidents, then positive evidence would be at hand 
for supporting the notion of accident proneness as a general trait. Low
or zero coefficients of correlation among different types of accidents would 
argue against such a concept. 

While the evidence concerning the validity of the concept of accident 
proneness as a retention of liability under all circumstances is far from 
neral agreement, among investigators to the 
among accidents of different kinds are low. 
Working with groups of factory workers, Newbold (1) found coefficients 




might be considered representative of their data. For motor coach
operators, Brown and Ghiselli (3) found a coefficient of correlation of .25
between collision and non-collision accidents.

One of the major limitations of the data collected to date on the
generality of accident proneness is the scarcity of information on the
reliability of the measures employed in measuring accident rate in the
situations being compared. Numerous findings on the reliability of
accident rates have been published when only individual accidents were
under investigation. These reliability coefficients, of course, vary with
the nature of the job involved and with the length of time considered.
While they frequently approach zero they seldom rise above .80, most of
them falling in the neighborhood of .30 to .60. It is apparent, then, that
the coefficients of correlation between different types of accidents as
reported by various investigators are considerably lower than the
reliability coefficients of measures of accident rates. However, since data
relative to the intercorrelations between different types of accidents and the
reliabilities of those same types of accidents are not usually available,
only very tentative conclusions can be drawn concerning the concept of
accident proneness as a general trait. In an attempt to secure further
evidence on this matter, the present investigation was undertaken.


Basic Data 


The men considered herein are 59 trolley car motormen and 34 motor
coach operators employed in a city transit system. All of these men
had had two years' experience on their jobs. The accident records taken
for them covered a period of 18 months, and consisted of the entire year
1945 and the first half of 1946. All irregular cases, such as those who were
under suspension or were absent without leave, etc., were eliminated.

The accidents of motormen were classified as follows: collisions with 
pedestrians, collisions with other trolley cars, collisions with motor 
vehicles, boarding and alighting accidents, and accidents aboard the car 
such as slips, stumbles, falls, etc., of passengers. These accidents were
also grouped into the two major categories of collision and non-collision 
accidents. Since for the motor coach operators the rates for collisions 
with other than motor vehicles and for on-board accidents were very low, 
only the more inclusive grouping of collision and non-collision accidents 
was used. 


Results 


For all types of accidents, the reliability coefficients were estimated
from the coefficient of correlation between the number of accidents on the
odd and even months, corrected by the Spearman-Brown formula. These
coefficients, given specifically in Tables 1 and 2, varied from .19 to .80,
most of them being about .40.
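
The correction named above can be sketched as follows. The Spearman-Brown
formula estimates the reliability of the full record from the correlation
between its two halves (here, odd and even months). The half-record
correlation of .25 used below is a hypothetical input, not a value reported
in this study.

```python
def spearman_brown(r_half, k=2.0):
    """Reliability of a record k times as long as the one yielding r_half."""
    return k * r_half / (1 + (k - 1) * r_half)


# Hypothetical odd-even correlation of .25 stepped up to full length.
print(round(spearman_brown(0.25), 2))  # 0.4
```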

In Table 1 are given the reliability coefficients, intercorrelations, and
distribution constants for the various types of accidents of the streetcar 


Table 1 


Reliability Coefficients, Intercorrelations and Distribution Constants of Various 
Types of Accidents of 59 Trolley Car Motormen 


T ———— M ————————————— 


Standard 
1 2 3 4 5 Mean Deviation 
rr LA Пан Min NN 
1. Collisions with 46 10 —.11 12 02 75 1.54 
pedestrians 4 
2. Collisions with 292 04 17 44 65 
trolley cars 
3. Collisions with 42 08 107 12.22 5.87 
motor vehicles 
4. Boarding and alight- P ae US 42 .69 
ing accidents р 
5. On-board accidents .65 27 45 


motormen. It will be noted that the coefficients of correlation between 
the various types of accidents are generally quite low, and one, indeed, is 
negative. Since the reliability coefficients of two of the types of accidents 
—collisions with other trolley cars and boarding and alighting accidents— 


Table 2
Reliability Coefficients, Coefficients of Correlation, and Distribution Statistics
of Collision and Non-Collision Accidents


Trolley Car Motormen. N = 59

                            Collision   Non-collision             Standard
                            Accidents   Accidents        Mean     Deviation
Collision accidents            .42         .09           7.76       6.04
Non-collision accidents                    .40           1.14        .89


Motor Coach Operators. N = 34

                            Collision   Non-collision             Standard
                            Accidents   Accidents        Mean     Deviation
Collision accidents            .82         .13           5.18       2.97
Non-collision accidents                    .80           1.68       1.40




are quite low, comparisons are difficult. Even with the remaining more 
reliable measures, however, the intercorrelations do not differ greatly 
from zero. When the correlations between collision and non-collision 
accidents, shown in Table 2, are considered, it is again apparent that the 
relationship between different types of accidents is low. 
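
One way to see how far low reliability could account for a low
intercorrelation is Spearman's correction for attenuation, which divides
the observed correlation by the square root of the product of the two
reliabilities. This correction is not used by the authors and is offered
only as a sketch; the inputs read the Table 2 entries for trolley car
motormen as .09 (correlation) and .42 and .40 (reliabilities).

```python
import math


def disattenuated_r(r_xy, r_xx, r_yy):
    """Observed correlation corrected for unreliability in both measures."""
    return r_xy / math.sqrt(r_xx * r_yy)


# Trolley car motormen, collision vs. non-collision accidents.
motormen = disattenuated_r(0.09, 0.42, 0.40)
print(round(motormen, 2))  # 0.22
```

Even after such a correction the estimated relationship remains small,
which is in keeping with the authors' reading of the table.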


Conclusions 


From the evidence reported by other investigators, together with that 
presented in the present report, it cannot be considered that accident 
proneness as a general trait of the individual has been substantiated. 
Simply because an individual has a high rate for accidents of one kind, it 
does not necessarily follow that he will have a high rate for accidents of 
another kind. Rather, if there is any tendency to retain liability to have 
accidents under many different circumstances, the facts would indicate 
that such a tendency is of very minor importance as a factor in the 
determination of accidents. 


Received May 10, 1947. 


References 


1. Newbold, E. M. A contribution to the study of the human factor in the causation 
of accidents. Indus. Fat. Res. Bd., Report, No. 34, 1936. 

2. Farmer, E., and Chambers, E. G. A study of personal qualities in accident
proneness and proficiency. Indus. Fat. Res. Bd., Report, No. 55, 1929.

3. Brown, C. W., and Ghiselli, Edwin E. Factors related to the proficiency of motor
coach operators. J. appl. Psychol., 1947, 31, 477-479.


The Guilford-Zimmerman Aptitude Survey 


J. P. Guilford and Wayne S. Zimmerman 
University of Southern California 


It is the purpose of this article to describe the development of the 
Guilford-Zimmerman Aptitude Survey (7), to cite some of its novel
features, and to present some preliminary results.¹


Reasons for the Survey 


The Factorial Approach to Aptitude Testing. First of all, it is our
conviction that aptitudes of individuals can be evaluated most
adequately, economically, and meaningfully by using a series of tests each
of which measures a unique ability (4). Secondly, we believe that the
aptitudes required for doing successfully the many kinds of tasks in a
complex society are much more numerous and varied than has generally
been supposed (2, 6). Thirdly, we emphasize the importance of
recognizing more fully than heretofore, that certain of these unique abilities
are measured best by “power” tests and others by “speed” tests. Fourthly,
the knowledge and understanding of ability factors have grown
rapidly during the past few years, especially as a result of war time
research, and application should be made of these advances (3).

In the past, tests of intelligence, of clerical ability, and of mechanical 
ability have served well as far as they go. It is now realized, however, 
that they do not go far enough. Before the recent war, it had been 
shown by statistical analysis that human resources, as measured by 
tests, fall into rather separate and distinct traits to which Thurstone has 
given the general name of “primary abilities” (9). The accelerated
development of knowledge of aptitudes during the war has served to confirm
the belief in aptitude factors or primary abilities, and to advance con- 
siderably the information concerning them (3). 

Advantages of Factor Tests. The most important advantages of a 
series of tests each of which is relatively unique are (4): 


1. Wide Coverage at Economical Cost. In most batteries of tests of
the traditional type there is so much duplication in the measurement
that there are many more tests than there are factors covered. Furthermore,
the number of factors actually touched is generally quite limited.
Success in a job or assignment often depends upon a large number of
factors. For example, it has been estimated that success in pilot training
in the Army depends upon at least 20 separate abilities and traits (3).
The best way to be sure that all aspects of a job are covered economically,
is to recognize the factors and to provide the minimum number of tests
needed.


¹ In this article portions of the test manual (7) written by the same authors have
been liberally quoted, without the use of quotation marks, by permission of the
copyright holder.

2. Meaningfulness and Dependability. When a factor is measured 
by a test that is unique for that factor, we know rather definitely what 
the score means. On the other hand, if a single test measures two or 
more common factors, an individual’s score cannot be interpreted with 
confidence. A better-than-average score in the test may be due to one 
exceptionally strong ability combined with one or more weak abilities, 
or it might just as well be due to equivalent strengths in all abilities meas- 
ured. For use in vocational guidance, tests should yield maximal dif- 
ferences between scores for each person. This is best achieved by means 
of factor tests. Furthermore, experience has shown conclusively that 
complex tests do not measure any one factor as well as do factorially 
“pure” tests. 

3. Adaptation to Batteries. With a battery of factorially unique tests, 
one can select a single test for each factor that should be measured, ex- 
clude factors that are not needed, and give each test its optimal weight 
and thus achieve maximal prediction. With complex tests this usually 
cannot be done, since a good measure of a desired factor may be (and 
often is) a measure also of an unrequired or undesired factor. Thus, 
weighting a test weights the undesired factor as well as the desired one.

4. Enlightened Selection of Batteries. The best knowledge to have 
concerning the assignment for which selection is to be made is in terms of 
the primary abilities required for success in that assignment. Familiarity 
with the factors and the tests that measure them will often be sufficient
to serve as a basis for effective choice of tests until further information 
becomes available. An inspectional job analysis made with the primary- 
ability categories in mind is the most effective preliminary approach in 
this connection. If the job requires rapid and accurate computations as 
a significant component, then the number factor should be measured.
If the job requires careful and accurate inspection of minor details, then 
the perceptual-speed factor should be measured. The dependability of 
these better known factors is such that if the observation of the job activi- 
ties is correct, one would rarely go far astray in the choice of selective tests 
following the approach recommended. 
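
The argument of points 2 and 3 above may be illustrated with a small
numerical sketch. A test score is treated here as a weighted sum of factor
standings; this linear model, the loadings, and the two hypothetical
examinees are all assumptions made for illustration and are not taken from
the Survey manual.

```python
def score_on(loadings, abilities):
    """Score on a test as a weighted sum of the examinee's factor standings."""
    return sum(weight * abilities[factor] for factor, weight in loadings.items())


# Hypothetical tests: one factorially complex, one factorially pure.
complex_test = {"number": 0.5, "perceptual_speed": 0.5}
pure_number_test = {"number": 1.0}

# Hypothetical examinees with different ability patterns (standard scores).
person_a = {"number": 2.0, "perceptual_speed": -1.0}
person_b = {"number": 0.5, "perceptual_speed": 0.5}

# The complex test cannot tell the two patterns apart.
print(score_on(complex_test, person_a), score_on(complex_test, person_b))  # 0.5 0.5

# The pure test separates them on the number factor.
print(score_on(pure_number_test, person_a), score_on(pure_number_test, person_b))  # 2.0 0.5
```

The same sketch shows why weighting a complex test also weights whatever
undesired factor it carries: doubling the weight of complex_test doubles the
contribution of both of its factors at once.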




Plan of the Survey 


Traditional Areas Covered. The Aptitude Survey has been planned 
to include, when completed, tests of approximately 20 primary abilities. 
The seven parts already published cover fairly well the important factors 
in the three traditional areas—abstract intelligence, clerical aptitude, and 
mechanical aptitude. The same seven tests may be combined in various 
ways to measure aptitudes for many other occupational areas. 

Parts I and II (Verbal Comprehension and General Reasoning) 
measure the two leading components of most verbal intelligence tests. 
Since these two factors are practically independent, it is possible for an 
individual to have quite different standings in the two tests. For this 
reason, it is highly desirable to have at least two separate scores to rep- 
resent this area of aptitude. 

Parts III and IV (Numerical Operations and Perceptual Speed) meas- 
ure leading components of most clerical-aptitude tests. Different kinds 
of clerical jobs depend to different degrees upon these two primary abili- 
ties. Every clerical job must be inspected in its own right for its require- 
ments in terms of human resources. Some clerical jobs also lean more or 
less heavily upon verbal and reasoning components. Selection for such
assignments would call for the use of Parts I and II of the Survey. Some
jobs might well involve significant amounts of spatial and visualizing
abilities which are represented by Parts V and VI. The term “clerical- 
aptitude” is a very rough occupational concept, not conforming to a 
single, fixed pattern of psychological functions. This situation calls for 
the use of different combinations of unique tests, differently weighted to 
meet special requirements for each type of clerical assignment. 

Parts V, VI, and VII (Spatial Orientation, Spatial Visualization, and 
Mechanical Knowledge), particularly Part VII, cover the most important
abilities required in mechanical pursuits. Like clerical aptitude, the 
concept of “mechanical aptitude” is variable. The common element 
that justifies the idea of mechanical aptitude is the machine, from which 
the concept takes its name. But there are many kinds of things one 
can do with, or in connection with machines. One can invent them, de- 
sign them, draw blueprints of them, read blueprints of them, construct 
them, operate them, or repair them. Little reflection is required to re- 
alize that the psychological resources important in all these connections 
are by no means the same. Extensive statistical analyses of mechanical 
tests have fairly well clarified this area of aptitude (3). The only under- 
lying ability that is unique to mechanical tests proves to be a mechanical- 
experience factor. This factor seems to depend upon the amount of 
basic, simple, practical mechanical knowledge that the individual has 
acquired, formally or informally. 




Parts V and VI stress other factors, both of which can be measured
independent of the mechanical-experience factor, but which are relatively 
important in many kinds of mechanical tasks. It is expected that Part 
V, Spatial Orientation, will be found most useful in prediction of success 
of street car operators, bus or truck drivers and other machine operators, 
for the ability measured is found as a component of psychomotor tests 
in which mechanical devices must be manipulated. Visualization should 
be most important in connection with designing, and with the making 
and reading of blueprints, though it is known also to be important in 
some machine-operating tasks such as piloting an airplane. 

New Areas of Aptitude Covered. It should be clear from what has 
been said that the three traditional aptitude areas, as customarily covered 
by tests, have been broken down into their chief fundamental aptitude 
factors and it is believed that those areas can thus be assessed by factor 
tests in a much more enlightened manner than heretofore. As it was 
pointed out before, the seven tests go much beyond the three traditional 
areas. There is no necessity of restricting the groupings of the tests to 
these three ways. By cutting across these traditional boundaries, a great 
many other types of vocational and professional aptitudes can be assayed. 
The various arts and jobs that combine academic and either clerical or 
mechanical aspects call for different regroupings of tests. When the list 
of primary abilities tested is later extended beyond the seven, the possi- 
bilities for other, non-traditional, groupings will become more apparent. 

The Factors and their Measurement. In Parts I through VII, the 
factors and their measurements may be described as follows (2, 3, 6): 


1. Verbal Comprehension. This ability is important in any activity
that requires understanding of meanings of words or verbal concepts.
A general vocabulary test, given as a power test, is the best and purest 
measure of it. It is commonly evident in such tests as same-opposites, 
verbal analogies, and reading comprehensions. These three types of 
tests are likely to measure individual differences in other factors which 
may not be wanted. Reading-comprehension tests may introduce a 
variety of factors, depending upon the content of the reading matter 
used, 

2. General Reasoning. There are several distinct types of reasoning 
ability, but the one measured by Part II seems to be common to a larger 
number of tests than any of the others. It probably makes its strongest 
contribution in tests composed of items which emphasize the principles 
characteristic of arithmetic-reasoning problems. From this, and from 
other facts, it appears that the key to the factor is the ability to diagnose
problems. It is best measured in a power test with items of appropriate
difficulty.




3. Numerical Operations. The ability to do rapid and accurate work with
numbers stands apart as a distinct human resource. The fact that the
extremely rare person may stand out as a "lightning calculator" and yet possibly be
generally low in most other abilities is a dramatic demonstration of the 
independence of this factor. The factor seems to be about equally well 
measured by any of the four fundamental operations, addition, sub- 
traction, multiplication, and division. Since correct answers to the prob- 
lems can be derived by nearly all examinees, given sufficient time, the test 
must be administered under speed conditions. 

4. Perceptual Speed. This is the ability to perceive detailed visual 
objects quickly and accurately. Part IV is composed of short matching
tests in which the examinee must note similarities and differences in the
forms and details of common objects. The items must necessarily be
quite easy. If not, they would involve other abilities such as discrimina-
tion of size and shape, and perhaps visual acuity and reasoning. Since
the items are easy, individual differences must be measured largely in 
terms of speed. 

5. Spatial Relations. Part V was designed to measure primarily an 
ability to appreciate spatial relations of things with reference to the 
human body. The awareness of whether one object is to the right or 
left of another, higher or lower, or nearer or farther away, seems to be the 
essential nature of this factor. Both the proper interpretation of struct- 
ures and the proper decisions as to how to adjust the body to the layout 
of machinery depend upon this ability. It was found to be one of the 
two most important factors for learning to pilot an airplane. It is pre- 
sumably important in other jobs involving machine operations, since 
nearly all tests known to contain the factor in some degree appear to have 
validity for predicting success in mechanical jobs. In Part V, each 
item shows the prow of a motorboat against a background scene in two 
similar views. The examinee must report what directions the boat has 
moved in going from the first to the second of the two pictures. The 
boat may have turned right or left, may have risen or fallen, and/or may 
have tilted right or left. 

6. Spatial Visualization. This factor seems to involve the process 
of imagining movements, transformations, or other changes in visual
objects. It is a dynamic kind of visualization, whereas another factor 
identified as visual memory is a static or reproductive visualization. It 
is represented in tests of mechanical movements, mechanical compre- 
hension, and in paper-folding tests of the Binet type. Such tests involve 
factors in addition to visualization and were therefore not used in the 
Survey. It should be pointed out that before the war the two factors,
spatial relations and visualization, had not been separated but were
regarded as one factor called spatial. This distinction has been confirmed
by a number of studies (3). 

Part VI is composed of pictorial items in each of which a familiar 
three-dimensional object is first shown in a certain position. Brief verbal 
instructions then call for turning, rotating and/or tilting the object. 
The examinee's task is to recognize the object in the new position. Sur- 
face-development tests also measure this factor, but nearly always, in 
our experience, also stress the general-reasoning factor. 

7. Mechanical Experience. It was indicated earlier that the only
factor unique to mechanical tests is an acquired-knowledge or experience 
factor. Part VII tests the examinee’s knowledge of common tools and 
mechanical problems such as occur around the home, of automobile parts, 
their functions and malfunctions, and of the common trades of carpentry, 
plumbing, welding, and the like. It is recognized that not all examinees 
have had equal opportunity to acquire this kind of knowledge. Girls 
and women are particularly handicapped in this respect, even though the 
emphasis is upon more common experiences. It has been found that 
individual differences in scores made on this test by males are highly 
prognostic of subsequent success in undertakings of a mechanical nature, 
e.g., aircraft mechanic and radio operator mechanic.


Construction of the Tests 


Some New Features in Test Construction. Several innovations in test 
development were employed in the construction of the Survey tests. In 
writing items for Part I, Verbal Comprehension, an attempt was made 
to keep the level of difficulty of all alternative responses equivalent to 
that of the word to be defined. In some vocabulary tests the responses 
are somewhat easier than the word to be defined and are of unequal famili-
arity. It is believed that the discrimination of the right answer from the
wrong ones in Part I is more likely than usual to require knowledge of
each alternative word meaning. If this is so, each item represents a 
wider sampling of word knowledge than is customary. Furthermore, the 
examinee cannot so easily arrive at the right answer by eliminating more 
familiar (for him) wrong answers. The latter kind of solution, by elimination,
is likely to involve reasoning as well as word knowledge. Experience
with Part I has shown an unusual tendency for examinees not
to guess, as indicated by the large number of omissions.

The approximate difficulty level of each word was determined originally
by the use of the Thorndike and Lorge lists (8). Subjective judgment
was introduced when it seemed that frequency might not be the
only clue to difficulty of meaning. A check upon the success of this 
method of word selection was made in an item analysis. It was found
that the rank order of difficulty of right responses, as shown by percentages
of examinees who chose them, correlated .87 with the difficulty order
assumed. In the verbal test the five alternative answers are all of the 
same part of speech—noun, verb, or adjective—as the word to be de- 
fined. Verb forms are clearly indicated by the use of the infinitive form. 

In the development of both Parts I and II, longer, preliminary forms 
(100 items for Part I and 30 items for Part II) were administered to 400 
lower-division university students and item analyses were made. The 
rank order of items for difficulty was established, and each item was 


range of difficulty rather evenly spaced over the range, and of satisfac- 
tory correlation with total score. The frequencies of choices of wrong 


for Parts III, IV, and V because they are speed tests in which the usual 
item statistics are of little value. Parts VI and VII were not analyzed 
because on the basis of previous experiences it was believed that the items 


only .19 in one sample and .20 in another, demonstrating the degree of
success achieved before item selection occurred. Part III presents a
somewhat novel mode of listing alternative answers, in which the same
six alternatives are used to serve two problems in common. This practice
conserves upon booklet space and probably upon reading time.

Administrative Features. For all parts, the instructions are made so
complete that the tests are largely self-administering. Separate answer 
sheets can be used with all parts except III and IV. 


Some Uses for Which the Tests are Recommended 
Kinds of Examinees to Which Adapted. The following list of occupa-
tional assignments for which it is believed each part of the Survey is pre-
dictive, is based partly upon known relatively direct evidence, partly
upon known indirect evidence, and partly upon enlightened guesses.
This list is an abridged version of one provided in the test manual (7).
It is presented merely for its suggestive value. 


Lists of Occupations in Which Factors Measured by the Guilford-Zimmerman 
Aptitude Survey are Probably Important 


Part I. Verbal Comprehension 
author 
editor 
engineer 
historian 
journalist
lawyer
navigator 
physician 
scientist 
teacher 


Part II. General Reasoning 
accountant 
engineer 
executive 
lawyer 
manager 
mathematician 
navigator 
physician 
scientist 
teacher 


Part III. Numerical Operations 
accountant 
actuary 
auditor 
bookkeeper 
calculator 
mathematician 
mathematics teacher 
navigator 
statistical clerk 


Part IV. Perceptual Speed 
aircraft pilot 
clerks (most kinds) 
inspectors (of products)
key punch operator
navigator
printer
proofreader
typist 


Part V. Spatial Orientation
aircraft pilot
athlete
blueprint reader
dentist
draftsman
steam shovel operator
street car operator
truck or bus driver
winch operator


Part VI. Spatial Visualization
aircraft pilot
blueprint reader 
dentist 
designer 
draftsman 
electrician 
engineer 
inventor 
machine operator 
navigator 
surgeon 


Part VII. Mechanical Knowledge
aircraft pilot
carpenter
chauffeur

eno 

lathe operator
locksmith
machinist (most kinds)
mechanic (most kinds)
repairman (most kinds)
toolmaker
tractor operator
truck driver


Some Preliminary Results 


Reliability. For Parts I, II, VI, and VII the odd-even reliability 
estimates (corrected by the Spearman-Brown formula) are .92, .92, .91,
and .92, respectively. The samples include from about 100 to more than
200 college men. For Parts III, IV, and V, reliability was estimated by 
administering each test in two separately timed, equivalent halves, inter- 
correlating the part scores and applying the Spearman-Brown formula. 
For these three parts the estimates are .92, .92, and .88, respectively
(N’s = 113, 108, 220). 
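
The Spearman-Brown step-up used for these reliability estimates can be sketched as follows. This is only an illustration: the half-test correlations behind the reported figures are not given in the article, so the value of .85 below is hypothetical, and the function names are chosen for this example.

```python
def spearman_brown(r_half: float, length_factor: float = 2.0) -> float:
    """Estimate the reliability of a lengthened test from the correlation
    between comparable parts (length_factor = 2 for two halves)."""
    return length_factor * r_half / (1.0 + (length_factor - 1.0) * r_half)


def half_correlation_for(r_full: float) -> float:
    """Invert the two-halves formula: the half-test correlation that would
    step up to a given full-length reliability."""
    return r_full / (2.0 - r_full)


# Hypothetical half-test correlation (not a figure from the article).
print(round(spearman_brown(0.85), 3))

# Half-test correlation implied by a stepped-up estimate of .92.
print(round(half_correlation_for(0.92), 3))
```

Under this formula a stepped-up estimate of .92 corresponds to a correlation of about .85 between the two halves.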

Validity. There are two types of validity for tests: factorial validity 
and practical validity (1). Factorial validity is indicated by the correla- 
tion of a test with a primary ability. Such a coefficient tells how well the 
test is measuring a factor. Practical validity is indicated by the correla- 
tion of the test with some criterion of adjustment, such as vocational 
proficiency. 

At the time this article was written, there were no practical validity 
data to report concerning the tests in the Survey. Later supplements 
to the Manual will report such information, since a number of validity 
studies are in progress. 

These particular tests have not yet been factor analyzed, so precise 
estimates of factorial validity are lacking. It can be stated with some 
justification, however, on the basis of known factorial composition of 
very similar tests (3) that the seven tests should be expected to have the 
following correlations with their dominant factors: Part I, .80; Part II,
.60; Part III, .80; Part IV, .75; Part V, .60; Part VI, .60; and Part VII,
.80. These factorial validities are probably close to the best obtainable
for the factors under present testing conditions.
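
One way to read these expected factorial validities is through the squared correlation, which gives the proportion of a test's score variance associated with its dominant factor. The short sketch below applies that standard relation to the figures quoted above; the dictionary and function names are illustrative only.

```python
# Expected correlations of each part with its dominant factor, as quoted above.
expected_validity = {
    "I": 0.80, "II": 0.60, "III": 0.80, "IV": 0.75,
    "V": 0.60, "VI": 0.60, "VII": 0.80,
}


def variance_share(correlation: float) -> float:
    """Proportion of test variance associated with the factor (r squared)."""
    return correlation ** 2


for part, r in expected_validity.items():
    print(f"Part {part}: r = {r:.2f}, share of variance = {variance_share(r):.2f}")
```

On this reading a factorial validity of .80 corresponds to about 64 per cent of the variance, and one of .60 to about 36 per cent.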

Intercorrelations. A set of factorially pure tests would have zero 
intercorrelations if the factors are unrelated. It is believed that the inter- 
correlations of the seven parts of the Survey will be very low, with few 
exceptions. In these instances a significant intercorrelation may be at- 
tributed to the presence of an additional factor or factors common to the 
two tests. It is possible that in one or two instances it will be desirable 
to suppress any unwanted factor variance in the test by combining its 
score in a linear equation with that of another test. Formulas have been 
provided for such an operation (5). 


A few intercorrelations already computed can be reported. Some of
them would be significantly greater than zero. Between Part I (prelim-
inary form) and Parts II (preliminary form), VI, and VII the correla-
tions are .19, .14, and .10 respectively (N’s = 84, 108, 107). It is of




Between Part II (preliminary form) and Parts III, IV, V (booklet 
form), and VII the correlations are .19, .13, .27, and .16, respectively 
(N’s = 142, 180, 180, 118). The correlation of .19 is strong evidence 
that the number factor was reduced to an unprecedented low status in 
a test made up of items involving the principles found in arithmetic- 
reasoning problems. The attempt by the authors to construct arithmetic- 
reasoning-type problems which could be stripped of their numerical con- 
tent, retaining only the desired reasoning element, was apparently quite 
successful. This was done on the basis of previous studies which indi- 
cated repeatedly that certain types of arithmetic reasoning problems 
contain the greatest amount of the general-reasoning factor. The low 
correlation with the perceptual-speed test indicates that the perceptual 
items were of a sufficiently low level of difficulty that a good general- 
reasoning ability could not be substituted to solve the problems when 
perceptual speed was lacking. The low correlation of the reasoning 
score with the mechanical score is evidence of a proper difficulty level 
for the latter test and of the omission of items that can be reasoned 
through to a solution, such as are likely to appear in tests of the mechan-
ical-principles type.

Between Parts III and IV the correlation is .24 (N = 243), between 
III and V (booklet form) it is .00 (N = 119) and between IV and V
(booklet form) it is .28 (N = 116). When separate-answer-sheet record- 
ing is used with Part V, the correlations with some other Parts range 
quite a bit higher. Since the booklet form of Part V obviously is more 
pure, the use of this form is recommended. 
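
Whether a small intercorrelation differs reliably from zero depends on the size of the sample. A minimal sketch of the usual t ratio for a correlation coefficient is given below, applied to two of the coefficients of .19 reported above. The article does not say which test of significance the authors had in mind, so the choice of this statistic, and the function name, are assumptions made for illustration.

```python
import math


def t_for_r(r: float, n: int) -> float:
    """t ratio for testing a correlation against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# The same coefficient of .19 at the two sample sizes reported in the text.
for n in (84, 142):
    print(f"r = .19, N = {n}: t = {t_for_r(0.19, n):.2f}")
```

With samples of this size the two-tailed .05 critical value of t is roughly 2.0, so a coefficient of .19 falls short of it at N = 84 and exceeds it at N = 142.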

The correlation of the answer-sheet form of Part V with Part VI was 
found to be .55. When a different scoring formula was tried involving 
a stronger penalty for errors in Part VI, the correlation was .48. When 
part scores for Part VI were obtained and correlated with the booklet 
form of Part V (N = 134), the correlations were found to range as low
as .33. No specific recommendations are being made at the present time
regarding the application of special scoring formulae or of utilizing part
scores. The study is being continued, however, and when more confident
interpretation of these correlational variations can be made, the findings 
will be reported in supplements to the manual. 

Norms. Norms are available on a college population and are being 
accumulated on industrial and high-school populations. 


Summary 


In the Guilford-Zimmerman Aptitude Survey attempts are being made 
to produce a series of tests each of which features one primary ability. 
Seven parts already completed cover the most important factors in aca-
demic, clerical, and mechanical and other aptitude areas. New pro-
cedures were utilized in the attempt to achieve factorial uniqueness.
Preliminary data indicate fairly satisfactory results toward this end. 
It is proposed that such a battery will serve as a flexible, efficient, and 
manageable device for assessing individuals in many aptitude areas and 
will yield unusually interpretable and dependable evaluations. 

Received December 1, 1947. 


Early publication. 
References 
1. Guilford, J. P. New standards for test evaluation. Educ. and Psychol. Meas.,
1946, 6, 427-438.

2. Guilford, J. P. The discovery of aptitude and achievement variables. Science,
1947, 106, 279-282.

3. Guilford, J. P. (Ed.). Printed classification tests. Army Air Forces Aviation Psy-
chology Program Research Reports, Report No. 6. Washington, D. C.: Govern-
ment Printing Office, 1947.

4. Guilford, J. P. Factor analysis in a test development program (to appear in the
Psychol. Rev.).

5. Guilford, J. P., and Michael, W. B. Approaches to univocal factor scores (to appear
in Psychometrika).

6. Guilford, J. P., and Zimmerman, Wayne S. Some AAF findings concerning aptitude
factors. Occupations, 1947, 26, 154-159.

7. Guilford, J. P., and Zimmerman, Wayne S. The Guilford-Zimmerman aptitude
survey. Beverly Hills: Sheridan Supply Co., 1947.

8. Thorndike, E. L., and Lorge, I. The teacher's word book of 30,000 words. New
York: Teachers College, Columbia University, 1944.

9. Thurstone, L. L. Primary mental abilities. Chicago: University of Chicago Press,


Veterans’ Scores on the Purdue Pegboard Test 


J. R. Strange and A. Q. Sartain
Southern Methodist University 


“The Purdue Pegboard,” in the words of its authors, “is a test of 
manipulative dexterity designed to assist in the selection of employees in 
industrial jobs requiring manipulative dexterity, such as assembly, pack- 
ing, operation of certain machines, and other routine manual jobs of an 
exacting nature.”¹ The test consists of a board with two rows of small
holes into which metal pegs may be inserted. It yields two types of 
scores, one for the number of pegs placed in the holes, and one for the 
number of assemblies (consisting of a peg, a washer, a collar, and an-
other washer) completed. For the placing of pegs, scores may be obtained
for the right hand alone, for the left hand alone, for both hands working 
together, and for a total of right hand, left hand, and both hands. 


Statement of the Problem 


One problem of this study was to determine whether, in the light of 
the norms furnished by the authors of the Purdue Pegboard, veterans 
applying for vocational guidance constitute an inferior, an average, or a 
superior group. Another problem was to intercorrelate the two most im- 
portant scores yielded by the test and to correlate each of these scores 
with scores made on the Minnesota Rate of Manipulation Test, the latter 
yielding a Placing and a Turning score.


Subjects and Conditions of Study 


The tests were given at the Southern Methodist University Veterans’ 
Testing Bureau located at Dallas, Texas. This Bureau was set up by 
the University at the request of the Veterans’ Administration for the
purpose of vocational testing of veterans. Subjects for this study were
picked at random from all the veterans who had taken the test and in-
cluded both able-bodied and disabled men. Scores obtained from 850
subjects were used to build a set of “local norms” for the group on the
Pegboard. An intercorrelation was run between Pegboard Placing² and

1 “Preliminary Manual for the Purdue Pegboard,” Purdue Research Foundation,
Purdue University, p. 1.

2 The Placing score referred to is the total for right hand, left hand, and both hands
together.


Pegboard Assembly scores, using 447 subjects. For 75 subjects who had 
taken both the Purdue Pegboard and the Minnesota Rate of Manipula- 
tion tests, correlations were run between Purdue Placing and Minnesota 
Placing, Purdue Placing and Minnesota Turning, Purdue Assembly and 
Minnesota Placing, and Purdue Assembly and Minnesota Turning. 


Results of the Study 


Tables 1, 2, and 3 give the results of this study so far as the percentile 
rank of each score is concerned.³ Thus, when using the right hand the


Table 1

Percentile Ranks of Scores Made by Male Veterans on Three Subtests
of the Purdue Pegboard Test (N = 850)

                    Percentile Rank
Score    Right Hand    Left Hand    Both Hands
 23         99.9
 22         99.8
 21         99.1          99.5
 20         97.4          98.9
 19         92.5          96.8
 18         80.9          91.0         99.9
 17         61.6          74.8         99.1
 16         42.4          55.8         97.8
 15         24.9          36.6         90.5
 14         12.0          20.5         75.9
 13         04.7          09.7         53.1
 12         01.3          03.3         28.2
 11         00.4          01.4         13.2
 10         00.1          00.2         05.1
  9                       00.1         01.9
  8                       00.1         00.5
  7                       00.1         00.1


placing of 21 pegs in the holes in the allotted time was sufficient to put 
the man in the top 1% of the group. For the left hand the corresponding 
score was 20, for both hands 17, and for the total 56. Likewise 12½
assemblies were required for ranking in the top 1%. Similar statements 
could be made about the other percentile ranks. 
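
How a table of percentile ranks of this kind may be built from raw scores can be sketched briefly. The article does not state which definition of percentile rank was used; the example assumes the common midpoint convention (per cent of cases below a score plus half of those at it), and the scores shown are hypothetical rather than taken from the study.

```python
from collections import Counter


def percentile_ranks(scores):
    """Return {score: percentile rank} under the midpoint convention:
    100 * (cases below the score + half the cases at the score) / N."""
    n = len(scores)
    counts = Counter(scores)
    ranks = {}
    below = 0
    for score in sorted(counts):
        freq = counts[score]
        ranks[score] = 100.0 * (below + 0.5 * freq) / n
        below += freq
    return ranks


# Hypothetical right-hand placing scores for eight examinees.
sample = [14, 15, 15, 16, 16, 16, 17, 18]
for score, rank in sorted(percentile_ranks(sample).items(), reverse=True):
    print(f"{score:>3}  {rank:5.1f}")
```

A table of local norms such as Table 1 is the same computation carried out over all 850 cases.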

Table 2 also compares the experimental group with the two groups of 
subjects on whom the test was standardized. It is evident that our 
“local norms” for the veterans’ group are higher at every point than those
for the men on whom the test was standardized. Indeed, the percentile 


3 The reference in all cases is to a single trial and not to the average of three trials.


Table 2

Comparison between Male Veterans' and Standardizing Groups in Total
Placing Scores on the Pegboard Test (N = 850)

                  Percentile Rank
         Veterans'     Standardizing Group
Score    Group         Men       Women
 62                              100
 61
 60      99.9
 59      99.8
 58      99.7
 57      99.5
 56      99.1                     99
 55      98.5                     99
 54      96.7                     98
 53      94.6          100
 52      91.5                     95
 51      87.8
 50      81.5           99        88
 49      78.9
 48      66.8           98        78
 47      58.9           96
 46      49.1           93        63
 45      41.6           87
 44      33.8           80        46
 43      27.6           78
 42      21.6           64        29
 41      16.4           54
 40      11.3           44        17
 39      08.6           34
 38      05.5           26         8
 37      03.9           17
 36      02.5           16         3
 35      01.8           10
 34      01.1            6         1
 33      00.5            3
 32      00.4            2
 31      00.3            1
 30      00.3
 29      00.1


ranks obtained in the present study are much closer to those obtained
from the women, rather than from the men of the standardizing group.⁴
Table 3 gives similar information for the Assembly test. Here the 


4 Since means and standard deviations are not given for the standardizing groups,
significance ratios could not be calculated.


Table 3

Comparison between Male Veterans' and Standardizing Groups in
Assembly Scores on the Pegboard Test

                  Percentile Rank
         Veterans'     Standardizing Group
Score    Group         Men       Women
14¼      99.9
14       99.8                    100
13¾      99.7
13½      99.7
13¼      99.7
13       99.7                     99
12¾      99.5
12½      99.1
12¼      98.6
12       98.1                     95
11¾      96.9
11½      96.2
11¼      95.4
11       93.4          100        86
10¾      91.1           99
10½      87.4           98
10¼      82.9           97
10       77.2           96        66
9¾       70.0           94
9½       62.8           92
9¼       57.9           89
9        50.5           86        44
8¾       41.9           75
8½       35.6           69
8¼       31.2           65
8        24.9           57        25
7¾       18.9           45
7½       15.9           40
7¼       13.4           33
7        10.3           28        12
6¾       07.2           19
6½       05.3           17
6¼       03.9           14
6        02.8           12         4
5¾       01.5            8
5½       01.1            7
5¼       00.6            5
5        00.4            3         1
4¾       00.3
4½       00.2
4¼       00.1
4        00.1            1




same trends are evident, with the women's norms being closer to those 
obtained in the present study, though even they appear to be somewhat 
lower. 

The final problem of the study was the intercorrelation between Peg- 
board Placing and Assembly, and the correlations between each of these 
and the Placing and Turning scores of the Minnesota Rate of Manipula- 
tion Test. Table 4 gives the relevant information on this point. The 
two scores on the Pegboard correlate with each other to about the same ex- 
tent as the Pegboard Placing score correlates with the two scores of the 
Minnesota test. As might be expected, the Assembly scores do not cor-
relate as highly with the Minnesota scores. It is evident that these coeffi-
cients are relatively low. Whether they indicate low reliability in one
or both of the tests or the likelihood that the various tests involve different 
factors, the present study affords no way of determining. 
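
The precision of the coefficients reported in Table 4 can be gauged with Fisher's z transformation. The following is a sketch under the assumption of an approximate 95 per cent interval (z = 1.96); the interval is not part of the original analysis, and the function name is illustrative.

```python
import math


def fisher_ci(r: float, n: int, z_crit: float = 1.96):
    """Approximate confidence interval for a correlation via Fisher's z."""
    z = math.atanh(r)                 # transform r to z
    se = 1.0 / math.sqrt(n - 3)       # standard error of z
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Coefficients and sample sizes as reported in Table 4.
table_4 = [
    ("Pegboard Placing vs. Pegboard Assembly", 447, 0.53),
    ("Pegboard Placing vs. Minnesota Placing", 75, 0.50),
    ("Pegboard Placing vs. Minnesota Turning", 75, 0.59),
    ("Pegboard Assembly vs. Minnesota Placing", 75, 0.33),
    ("Pegboard Assembly vs. Minnesota Turning", 75, 0.36),
]
for label, n, r in table_4:
    low, high = fisher_ci(r, n)
    print(f"{label}: r = {r:.2f}, interval {low:.2f} to {high:.2f}")
```

With only 75 cases the intervals are wide, which is one reason differences among the smaller coefficients should be read with caution.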


Table 4
Intercorrelation of Subtests of the Pegboard Test and Correlations of Each
with Subtests of the Minnesota Rate of Manipulation
Test for Male Veterans

Subtests                                        N      r
Pegboard Placing vs. Pegboard Assembly         447    .53
Pegboard Placing vs. Minnesota Placing          75    .50
Pegboard Placing vs. Minnesota Turning          75    .59
Pegboard Assembly vs. Minnesota Placing         75    .33
Pegboard Assembly vs. Minnesota Turning         75    .36


Summary and Conclusions 


On the basis of a random sample of 850 male veterans who were
seeking vocational guidance and who had taken the Purdue Pegboard
test, percentile ranks were calculated for the various subtests, and these
were compared with norms furnished by the authors of the test. Using
a sample of 447 cases the Assembly and (total) Placing scores on the
Pegboard were intercorrelated, and finally with a sample of 75 cases, the
Pegboard Assembly and Placing scores were correlated with Placing and
Turning scores on the Minnesota Rate of Manipulation test. From the
study the following conclusions were drawn:


1. Men's norms as furnished by the authors of the test are definitely
lower than those obtained with the group of male veterans. 

2. Women's norms as furnished by the authors of the test are consider- 
ably closer to those of the male veterans’ group. 

3. The correlation between the Pegboard Placing scores and Assembly
scores was relatively low (.53).




4. Correlations between Pegboard Placing scores on the one hand and 
Minnesota Placing and Turning scores on the other were also relatively 
low (.50 and .59 respectively). 

5. The Pegboard Assembly scores correlated with the Minnesota 
scores to a considerably smaller degree (.33 and .36 respectively). 


Received March 24, 1947. 




High School Norms for the Grove Modification of the 
Kent-Shakow Formboard Series 


Ruth C. Wylie 
Connecticut College 


Alice W. Wilson 
Western Reserve University 


and 


William R. Grove 
University of Pittsburgh 


Grove¹ has described a modification of the Kent-Shakow Formboard
Series which seems to be fairly homogeneous and which proved difficult
enough to discriminate among adults. Wylie² has demonstrated that this
test is reliable enough at the high school level to warrant further stand- 
ardization. 

Like other investigators in the field of performance testing, Grove 
found that raw time and move scores yield skewed distributions and 
non-linear correlations among themselves and with scores from other 
tests. 

Various more or less arbitrary means of weighting raw scores have 
been used to take care of these difficulties. Grove used a pragmatic 
weighting table to obtain scores which would yield fairly normal distri- 
butions. He fitted free-hand curves as a basis for rectifying his data, 
after having failed in more precise methods of mathematical curve fitting. 
The method used was further justified by the fact that the scores thus 
rectified yielded linear intercorrelations. 

The work reported in this paper deals with: (1) the development of a 
more adequate scoring rationale for use with the test and (2) the con- 
struction of high school norms for the test. 


Procedure 


Administrative Procedure. As part of a larger study, four perfor- 
mance tests and two group tests were administered to 352 boys chosen at 


1 Grove, W. R. Modification of the Kent-Shakow Formboard Series. J. Psychol., 
1939, 7, 385-397. 
2 Wylie, R. C. The reliability of the Grove modification of the Kent-Shakow Form-
board Series. J. appl. Psychol., 1947, 31, 155-159.


random from grades 7-12 of the Beaver Falls, Pennsylvania, Public 
Schools. The following tests were administered: 


1. Grove Modification of the Kent-Shakow Formboard Series.
2. Block Designs Test from the Wechsler-Bellevue Scale.
3. Cube Construction Test from the Cornell-Coxe Performance Scale.
4. O'Connor's Wiggly Block Test.³
5. California Test of Mental Maturity, Short Form.
6. Revised Minnesota Paper Formboard Test, Series AA.


Characteristics of the Normative Group. Since the scores of this group 
of subjects were being used to develop norms for the Modified Kent- 
Shakow Formboard Series, it was necessary to inquire first into the gen- 
eral characteristics of this sample. 

Beaver Falls is an industrial community of approximately 18,000
people, not far from Pittsburgh, Pennsylvania. The town is predomi-
nantly middle class with a generous proportion of pupils of foreign ex-
traction in the High School.


Table 1
Means and Standard Deviations of Rate (Recesses per Minute) Scores on the Modified
Kent-Shakow Formboard Series (352 Boys in the Junior-Senior
High Schools of Beaver Falls, Pennsylvania)

                                    Subtest
                     A               B               C               D
Grade     N     Mean    S.D.    Mean    S.D.    Mean    S.D.    Mean    S.D.
7         54     8.44   2.83    2.17    1.22    2.41    1.26    1.10    0.57
8         56     8.52   2.59    2.54    1.50    1.89    1.23    1.04    0.66
9         62     9.68   2.46    2.96    1.44    2.48    1.06    1.29    0.53
10        54     9.71   2.56    3.30    1.58    2.70    1.34    1.15    0.75
11        63     9.80   2.58    3.53    1.51    2.90    1.09    1.20    0.71
12        63    10.29   3.12    3.30    1.38    2.94    1.28    1.32    0.67
9-12     242     9.88   2.71    3.27    1.49    2.74    1.19    1.24    0.67
7-8      110     8.49   2.71    2.36    1.38    2.10    1.27    1.08    0.61
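
The pooled rows of Table 1 (grades 9-12 and grades 7-8) can be checked against the single-grade rows. The sketch below combines subgroup means and standard deviations into a pooled mean and standard deviation for Subtest A. It assumes the standard deviations were computed with N rather than N - 1 in the denominator, which the article does not state, and the function name is illustrative.

```python
import math


def pool(groups):
    """Combine (n, mean, sd) triples into an overall n, mean, and sd.
    Assumes each sd was computed with n (not n - 1) in the denominator."""
    total_n = sum(n for n, _, _ in groups)
    mean = sum(n * m for n, m, _ in groups) / total_n
    # Sum of squares about the pooled mean: within-group plus between-group parts.
    ss = sum(n * (sd ** 2 + (m - mean) ** 2) for n, m, sd in groups)
    return total_n, mean, math.sqrt(ss / total_n)


# Subtest A, grades 9 through 12, as given in Table 1.
subtest_a = [(62, 9.68, 2.46), (54, 9.71, 2.56), (63, 9.80, 2.58), (63, 10.29, 3.12)]
n, mean, sd = pool(subtest_a)
print(n, round(mean, 2), round(sd, 2))
```

The result agrees with the 9-12 row of the table (N = 242, mean 9.88, S.D. 2.71) to two decimal places.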

The mean chronological ages progressed from 12 to 18 in accord with 
what one might expect in a random sample from these grades.

Tables 1, 2, 3, and 4 summarize the data obtained from the tests 
administered to this group. The mental ages obtained from the Cali-
fornia Test of Mental Maturity tend to run slightly above those expected
for these grades. The agreement is close between the Block Design Scores
made by this group and Wechsler’s norms for this test. On the Revised 


3 Later eliminated from consideration due to its unreliability.




Table 2

Arithmetic Means and Standard Deviations of Mental Ages* of the Subjects Used in
Studying the Grove Modification of the Kent-Shakow Form-
board Series at the High School Level

                   M.A.              M.A.              M.A.
                 Language        Non-language      Total Mental
                 Factors           Factors           Factors
Grade    N†    Mean    S.D.     Mean    S.D.     Mean    S.D.
7        53    12.8    1.5      13.2    1.6      13.1    1.7
8        52    14.0    1.6      13.5    1.4      14.3    1.4
9        49    16.0    1.6      15.6    3.3      15.7    1.6
10       53    16.8    1.8      16.8    3.7      16.8    1.8
11       52    16.8    2.3      17.1    3.2      16.8    2.2
12       54    17.1    1.9      16.7    3.4      16.8    1.8


* California Test of Mental Maturity Intermediate S-Form (Grades 7-8) and Ad-
vanced S-Form (Grades 9-12).

† N = 313 in this table, this being the total number of cases for whom California
Test of Mental Maturity Scores were available. As far as is known no systematic
selective factor operated in determining which of the 352 subjects missed the group
testing sessions.


Minnesota Paper Formboard our group tends to run a little higher than 
the Likert and Quasha norms, the greatest difference appearing in the 
group whose chronological ages run from 16-25. This difference may be 
due to the fact that our 16-19 year-old group was not strictly comparable 
in age range or school status to Likert and Quasha's 16-25 year-old group. 


Table 3
Arithmetic Means and Standard Deviations on the Wechsler Block Designs
for the Sample Used in this Study as Compared with
the Norms Used by Wechsler

              Present Study               Wechsler
C.A.      Mean    S.D.    N*         Mean    S.D.    N
11         8.1    1.9      8          1А     2.4     60
12         9.8    2.6     33          9.1    2.5     60
13         9.5    2.8     35         10.1    2.9     70
14        10.4    2.8     55         10.7    2.9     70
15        10.8    3.0     52         10.7    3.2    100
16        11.2    2.5     64         10.9    3.2    100
17-19     10.9    3.2     60         11.0    2.9    100


* N = 307 in this table, this being the number of cases for whom the following data
were available: California Test of Mental Maturity Scores, chronological ages (between
11 years and 19 years), and Wechsler Block Designs Scores.




On the whole, the sample used in this study seems to be rather similar 
to various other samples which have been used in previous standardi- 
zations of these supplementary tests. 


Table 4
Comparison of Medians and Quartiles on the Minnesota Paper Formboard,
Revised Series AA, for the Sample Used in This Study with
the Norms Given by Likert and Quasha*

                                 Present Study            Likert and Quasha
Groups                        Median  Q1   Q3    N      Median  Q1   Q3     N
C.A. 16-25 (males)              41    35   47   116       34    26   39    147
C.A. 14-6 to 15-5 (males)       40    30   48    46       38    31   44    101
C.A. 11-6 to 12-5 (males)       30    25   34    21       32    26   38     96
High school seniors             42    35   47    53       39    33   45   1288


* The groups included are the only ones from Likert and Quasha’s norms which 
were comparable to the subjects of the present study.


Use of the critical ratio and analysis of variance techniques showed 
that the grades 9-12 constituted a sufficiently homogeneous group with 
respect to the formboard to justify combining their formboard scores
into a “pool.” This pool was then used as the standardization group in 
constructing the weighting tables and the norms. (See Tables 5 and 6.) 


Table 5
Summary of the Data on Variance for the Modified Kent-Shakow
Formboard Series
(Between the means of the recesses per minute for
grades 9, 10, 11 and 12; critical values are: F = 2.65 [remainder illegible])

Subtest      F
A          1.52
B          1.50
C          1.58
D          1.22


Kent-Shakow Formboard Series 45 


Table 6
Summary of Critical Ratios for the Differences between the Mean Recesses
per Minute Scores on the Modified Kent-Shakow Formboard Series

                         Grades
Subtest    9-12 vs. 7-8    9-12 vs. 8    7 vs. 8
A              4.48           3.48         0.15
B              5.69           3.32         1.43
C              4.57           4.47         2.17
D              2.28           2.00         0.50


Results


Scoring Rationale. Although both time and moves were recorded for 
each subtest for each individual in the standardization group, we decided, 
for several reasons, to discard moves as a scoring element. First, Wylie 
(loc. cit.) had demonstrated that scores based on moves were less reliable 
than scores based on time, and that moves combined with time did not 
improve the reliability of the test. Second, the correlation between rec- 
tified time and moves scores was +.84. Finally, moves are more difficult 
to count and record than time. For these reasons, it was decided to base 
the scoring upon time alone. 

As noted earlier, Grove (loc. cit.) had demonstrated that raw time scores 
were unsatisfactory. It was decided to try rate scores. The basic scor- 
ing unit adopted was the number of recesses correctly completed per 
minute. For each subtest rates were calculated by dividing the number 
of recesses correctly completed by the number of minutes taken to com-
plete them. Thus, since each subtest consisted of five recesses, a subject
taking five minutes to complete any given subtest would have a rate
score for that subtest of one recess per minute. Similarly, if a subtest
was completed in one minute, the rate score would be five recesses per 
minute; while if it were completed in thirty seconds, the rate score would 
be ten recesses per minute. 
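
The rate score described above can be sketched in a few lines of code. This is only an illustration of the stated definition (recesses correctly completed divided by minutes taken); the function name and the use of seconds as the input unit are assumptions made for the example, not part of the original scoring materials.

```python
def rate_score(recesses_completed: int, seconds_taken: float) -> float:
    """Recesses correctly completed per minute (hypothetical helper)."""
    if seconds_taken <= 0:
        raise ValueError("seconds_taken must be positive")
    minutes_taken = seconds_taken / 60.0
    return recesses_completed / minutes_taken


# The three cases given in the text, each for a subtest of five recesses.
for seconds in (300, 60, 30):
    print(seconds, "sec ->", rate_score(5, seconds), "recesses per minute")
```

The same function applies to a partially completed subtest: the number of recesses finished at the time limit is divided by the full time allowed.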

The use of these rate scores proved to be a satisfactory solution to
the difficulties encountered with raw time scores. First, it was found 
that rate scores bore a substantially linear relationship to the pragmatic 
weights used by Grove to rectify his raw time scores. At each extreme 
of the table, Grove’s weights deviated slightly from a linear relation- 
ship with rate scores, but through the middle range the linearity was 
satisfactory. Thus the rate scores were functionally comparable to
Grove’s pragmatic weights through the major portion of his scoring table. 
Second, the method of rate scores yielded more symmetrical distributions 


for all subtests than the distributions obtained from raw time scores.
Third, linear intercorrelations between the various subtests and between
the subtests and other tests resulted from the use of rate scores. In
contrast, the same regressions were non-linear when raw time scores were
used. Finally, the use of rate scores provides a satisfactory rationale
for scoring partially completed subtests on a basis similar to completed
subtests. Thus, rate scores for partial success (completion of one to four
recesses within the time limit for any subtest) can be computed in terms
of the same basic unit as for the fully completed subtests. This method
obviates the necessity for assigning arbitrary or pragmatically weighted 
point values to partial successes as previous performance test investiga- 
tors have done. 

Since the Modified Kent-Shakow Formboard Series consists of sepa- 
rate subtests, some method of weighting the subtests was necessary. If 
the subtests were permitted to weight themselves upon the basis of the 
rate scores obtained for each, undue weight would be given to the easier 
subtests at the expense of the more difficult. 

Ideally the weights assigned to the subtests should be determined on 
the basis of a validity study, i.e., against an external criterion. How-
ever, a validity study was beyond the scope of the present work. Wilson*
had tried assigning weights to each subtest in direct proportion to its 
relative difficulty. Cases in the pool 9-12, scored according to this 
method, yielded a distribution which was a poor fit for the normal curve. 
Other possible methods were statistically explored. We were impressed 
by the very high intercorrelations obtained between these various 
methods of weighting the subtests. We accordingly decided that the 
simplest and most direct method was to let each subtest contribute 
equally to the total score. 

It may seem to the reader that such a scoring procedure would be
cumbersome and time consuming in practical use. This would indeed 
be the case except for the fact that the entire process has been reduced 
to the use of a single conversion table. (See Table 7.) This table lists 
the weighted scores to be assigned for each subtest, depending upon the 
number of seconds taken by the subject on completed subtests, or the 
number of recesses completed at the time limit, in the case of unfinished
subtests. If no recesses were completed at the time limit, the score for
that subtest was zero. The scoring table has the further merit that it 


* Wilson, A. W. Educational norms for high school boys on the Modified Kent-
Shakow Formboard Series. Unpublished Ph.D. Thesis, University of Pittsburgh,
[date illegible].


Table 7
Weighted Scores Corresponding to Raw Time Scores for the Modified
Kent-Shakow Formboard Series
(242 High School Boys, Grades 9-12)*

Weighted   Subtest       Subtest         Subtest        Subtest
Score      A             B               C              D
 1         —
 2         I
 3         II
 4         III and IV
 5         90-120
 6         71-89         I               I              —
 7         59-70         II, III, IV     II             —
 8         50-58         255-360         III and IV     I and II
 9         43-49         178-254         203-240        III and IV
10         38-42         138-177         159-202        401-480
11         34-37         112-137         132-158        307-400
12         31-33         94-111          112-131        249-306
13         29-30         82-93           97-111         209-248
14         26-28         72-81           86-96          181-208
15         24-25         64-71           77-85          159-180
16         23            58-63           70-76          142-158
17         21-22         53-57           64-69          129-141
18         20            49-52           59-63          117-128
19         19            45-48           55-58          108-116
20         18            42-44           51-54          100-107
21         17            39-41           48-50          93-99
22         16            37-38           45-47          87-92
23         —             35-36           43-44          81-86
24         15            33-34           40-42          77-80
25         14            31-32           38-39          73-76
26         —             30              37             69-72
27         13            28-29           35-36          65-68
28         —             27              33-34          62-64
29         12            26              32             59-61
30         —             25              31             57-58


* These scores have been statistically adjusted to give total scores which are standard
scores with a mean of 50 and an S.D. of 10.1 for our normative group.


The steps that in Table 7 are so conveniently telescoped into one
simple operation are illustrated by the example shown in Table 8. This
example is based on the raw scores of a single subject of our normative
group. Column two gives the raw scores made by the subject. In


Table 8
Scoring Example

                                 Rate Score      Deviation of        Weighted Score
           Raw                   (Recesses       Subject's Rate      from Col. 1,
Subtest    Score                 per Minute)     Score from Mean     Table 7
A          48 sec.                  6.25            -1.34                 9
B          114 sec.                 2.63            -0.43                11
C          200 sec.                 1.50            -1.04                10
D          3 recesses at            0.38            -1.28                 9
           time limit

Total Weighted Score = 39


column three, these raw scores are converted into rate scores. To illus- 
trate how these rate scores were computed we may note that the five 
recesses of Subtest C were completed by this subject in 200 seconds or 
3.33 minutes. Thus, the subject's rate score for Subtest C was 1.5 re-
cesses per minute (5/3.33). On Subtest D the time limit is eight min-
utes, and in that time the subject completed three recesses. Thus, in


Table 9
Distribution of Total Weighted Scores Obtained on the Modified Kent-Shakow
Formboard Series by the Standardization Group
(242 High School Boys)

Total Score    Frequency
75-79              3
70-74              2
65-69             10
60-64             20
55-59             41
50-54             52
45-49             53
40-44             32
35-39             17
30-34              7
25-29              0
20-24              2
15-19              1
10-14              1
5-9                1

Mean 50.0


Subtest D, he was working at the rate of 0.38 recess per minute (3/8).
Column four lists the sigma deviation of each of this individual's rate 
scores from the mean of the normative group. These values were ob- 
tained by use of Table 1. If these sigma deviation values were algebra- 
ically totalled we would have a score which would give equal weight to 
each subtest. In column five the same result is more expeditiously ac- 
complished by using the scoring table (Table 7) in which equal weight 
is given to each subtest with the added advantage that the total weighted 
score is actually a standard score. Thus, the individual in our example 
obtained a total weighted score of 39 which is 1.1 sigmas below the mean 
of high school boys. By using Table 7, one can thus avoid doing such 


Table 10
Percentile Scores for the Standardization Group Corresponding to Total Weighted
Scores on the Modified Kent-Shakow Formboard
(242 High School Boys)

[Percentile column illegible in the source. Legible total scores, in descending
order: 74, 70, 68, 67, 66, 62, 58, 55, 53, 50. Remaining entries illegible.]


computations as are illustrated in columns three and four of the example, 
yet achieve comparable results. 
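
The scoring example can be retraced with a small lookup. The sketch below transcribes only the Table 7 rows needed for the subject in Table 8; the dictionary layout and function names are hypothetical conveniences, and the S.D. of 10.1 is the figure given in the footnote to Table 7.

```python
# Partial transcription of Table 7: (low seconds, high seconds, weighted score).
# Only the rows needed for the Table 8 subject are included here.
TIME_ROWS = {
    "A": [(43, 49, 9)],
    "B": [(112, 137, 11)],
    "C": [(159, 202, 10)],
}
# Weighted scores for subtests unfinished at the time limit,
# keyed by (subtest, recesses completed). Subtest D: III and IV -> 9.
PARTIAL_ROWS = {("D", 3): 9, ("D", 4): 9}


def weighted_from_time(subtest: str, seconds: int) -> int:
    """Weighted score for a completed subtest, looked up by seconds taken."""
    for low, high, score in TIME_ROWS[subtest]:
        if low <= seconds <= high:
            return score
    raise KeyError(f"{seconds} sec not in the transcribed rows for {subtest}")


def weighted_from_partial(subtest: str, recesses: int) -> int:
    """Weighted score for an unfinished subtest; zero recesses scores zero."""
    if recesses == 0:
        return 0
    return PARTIAL_ROWS[(subtest, recesses)]


total = (
    weighted_from_time("A", 48)
    + weighted_from_time("B", 114)
    + weighted_from_time("C", 200)
    + weighted_from_partial("D", 3)
)
# Express the total in sigma units of the normative group (mean 50, S.D. 10.1).
sigma_units = (total - 50) / 10.1
print(total, round(sigma_units, 1))  # 39 -1.1
```

A full implementation would carry every row of Table 7; the point here is only that one lookup per subtest replaces the rate and sigma-deviation computations.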

Norms for High School Boys. As has already been explained, our
scoring table (Table 7) has been so constructed as to incorporate
norms for a homogeneous and representative group of high school
boys.

The actual distribution of total weighted scores obtained for our
group is presented in Table 9. This distribution is a good fit for the
normal curve (chi square test). Table 10 has been constructed to give
the percentile values corresponding to the standard scores.
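
As a rough check on Table 9, the mean and S.D. of the grouped distribution can be recomputed from interval midpoints. This is a sketch under two assumptions: scores are treated as concentrated at the midpoint of each interval, and the frequency for the 20-24 interval is taken as 2, the value implied by the stated total of 242 cases.

```python
import math

# (lower limit, upper limit, frequency) for each interval of Table 9.
# The 20-24 frequency of 2 is an assumption implied by N = 242.
TABLE_9 = [
    (75, 79, 3), (70, 74, 2), (65, 69, 10), (60, 64, 20), (55, 59, 41),
    (50, 54, 52), (45, 49, 53), (40, 44, 32), (35, 39, 17), (30, 34, 7),
    (25, 29, 0), (20, 24, 2), (15, 19, 1), (10, 14, 1), (5, 9, 1),
]

n = sum(freq for _, _, freq in TABLE_9)
mean = sum(freq * (low + high) / 2 for low, high, freq in TABLE_9) / n
variance = sum(
    freq * ((low + high) / 2 - mean) ** 2 for low, high, freq in TABLE_9
) / n
sd = math.sqrt(variance)

print(f"N = {n}, mean = {mean:.2f}, S.D. = {sd:.2f}")
```

Under these assumptions the grouped figures come out close to the mean of 50 and S.D. of 10.1 stated for the normative group.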


Discussion 


Practical experience with the formboard, and correlations between
the formboard and other tests, support the hypothesis that this test
measures an aspect of general intelligence, but will probably be found


more heavily weighted with the spatial visualization factor (Wylie, loc.
cit.). Validity studies are now badly needed to increase further the use-
fulness of the series.

Although this test requires bulky and expensive equipment, and must 
be individually administered, it has some very good features which should 
commend it to the clinician and counselor: 


1. It has intrinsic interest for adult subjects;
2. It is self-corrective;
3. It has an apparent validity which commends it to the mechanic
or engineer who may doubt the efficiency of pencil and paper tests;
4. It does not penalize the individual who lacks verbal skills;
5. It can be administered with relative ease to a completely deaf
person;
6. It is apparently able to discriminate adequately over a wide range 
of ability; 
7. As performance tests of this type go, it is a reliable instrument; 
8. It takes a relatively short time for administration and scoring 
(approximately fifteen minutes). 


The test is not now commercially available, but a manual is being 
prepared and it is hoped that a reliable test publisher can be interested 
in manufacturing and distributing the test. 

Received March 22, 1947. 


Visual Skills and Labor Turnover 


N. C. Kephart 
Division of Education and Applied Psychology, Purdue University 


A relationship has been repeatedly shown between visual skills of 
workers and such industrial problems as production efficiency, quality 
of work, training time, safety, etc. A further difficulty in industrial 
personnel procedure is the problem of labor turnover. Much needless 
expense is incurred in the training and breaking in of new employees
who after such training remain on the job only a short time. Industri- 
alists are therefore interested in determining the factors which cause 
new employees to quit after a short period on the job and as far as possible 
eliminating these factors so that employees can be hired with a fair as- 
surance that they will stay on the job until their work has become pro- 
fitable both to themselves and to the company. The present paper will 
show how, in a specific company, vision was found to be one of the factors 
contributing to short tenure of workers. 

A large manufacturer of optical goods noted a decidedly high rate of
turnover in his lens inspection department. The jobs in this depart- 
ment involved the inspection of eye glass lenses for surface quality. 
After cleaning the lens, the operator looks for physical defects such as 
chips, scratches, holes, bubbles and so on. This work is highly repetitive 
but requires more than the average visual attention since many of the 
defects to be noticed are small and difficult to detect visually. 

As a part of the employment procedure in this company, each appli- 
cant for employment is given the battery of visual tests incorporated in 
the Ortho-Rater before he is hired.¹

Through this pre-employment testing program, visual test scores were 
available for all workers who had been placed on this job irrespective 
of the length of time they remained on the job. From the records a 
group of 32 workers who had remained on this inspection job for a period 
of 8 months or more was identified as a sample of workers remaining on
the job. Similarly a group of 67 workers who had been placed on this
job and who remained less than 4 months was identified as a sample of
workers who left the job shortly after they were hired. These two groups 
¹ The Ortho-Rater is a visual testing device manufactured by the Bausch & Lomb
Optical Co. for the visual classification and placement of industrial employees.


were chosen so as to represent reasonably extreme ends of the tenure 
distribution in order to determine whether available personnel and test 
data would differentiate between employees who did not stay on the job 
even long enough to complete the training and those who remained suffi- 
ciently long to justify their initial employment. By reference to the pre- 
employment test file, visual test scores were obtained for all workers in 
each of these two groups. The visual test scores of the workers who re- 
mained more than 8 months were compared with the scores of those 


[Figure 1: histograms of number of workers by vertical phoria test score (1-9);
one panel is legible as "Staying longer than 8 months".]

Fig. 1. Distribution of scores on test of vertical phoria.


workers who remained less than 4 months to determine the relationship 
between visual skill and length of tenure on this job.²

Figure 1 shows the distribution of scores on the test of vertical phoria 
at the optical equivalent of 26 feet. Under certain test conditions, which 
eliminate the necessity for the eyes to converge on a single point, the eyes 
assume a posture that may converge or diverge from that required in 
normal seeing at the test distance. Such postures, measured in terms of 
angular deviation from that required normally for that distance, are 
considered the phoria condition. In the battery of tests used in this 


² The method of analysis used in this comparison has been described previously.
Tiffin, Joseph, and Wirt, S. E. Determining visual standards for industrial jobs by 
statistical methods. Trans. Amer. Acad. Ophthal. and Otolaryn., 1945, 50, 72-93. 


study such postures are measured in both a vertical direction and lateral 
direction. There are, therefore, two such phoria tests, a vertical phoria
and a lateral phoria.

From the distribution in Figure 1 it can be seen that the number of 
individuals scoring at either extreme on this test is greater among those 
workers who remain less than 4 months than among those who remain 
more than 8 months. Only 3 individuals scored 8 or 9. None of these 3 
individuals remained on the job more than 4 months. Only 11 individuals
scored less than 5. Of these, 10 remained on the job less than 4 months 
and only 1 more than 8 months. 


[Figure 2: histograms of number of workers by lateral phoria test score (1-15);
one panel is legible as "Staying more than 8 months".]

Fig. 2. Distribution of scores on test of lateral phoria.


The standard deviation of scores for those individuals remaining more
than 8 months was .75. The standard deviation of scores for those in-
dividuals remaining less than 4 months was 1.07. This is a difference of
.32 which is 2.46 times its standard error. It thus appears that workers
who remain less than 4 months are more variable in vertical phoria than
are workers who remain more than 8 months. This conclusion can be
drawn at the 2% level of confidence.
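
The paper does not state how the standard error of the difference between the two standard deviations was obtained. The sketch below assumes the classical large-sample formula, in which the standard error of a standard deviation is S.D./√(2N), and applies it to the group sizes given earlier (32 and 67). It is an illustration of that assumed formula, not a reconstruction of the author's computation.

```python
import math


def sd_difference_critical_ratio(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Critical ratio for the difference between two independent S.D.s.

    Assumes the large-sample standard error of an S.D.: sd / sqrt(2 * n).
    """
    se1 = sd1 / math.sqrt(2 * n1)
    se2 = sd2 / math.sqrt(2 * n2)
    se_difference = math.sqrt(se1 ** 2 + se2 ** 2)
    return abs(sd1 - sd2) / se_difference


# Workers staying 8 months or more (N = 32) vs. less than 4 months (N = 67).
cr = sd_difference_critical_ratio(0.75, 32, 1.07, 67)
print(f"critical ratio = {cr:.2f}")
```

With the rounded S.D.s printed in the text, this gives a ratio of roughly 2.4, close to the reported 2.46; the small gap may reflect rounding of the published figures or a slightly different formula.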

Figure 2 shows the distribution of scores for the two groups on the
test of lateral phoria at the optical equivalent of 26 feet. It will be seen
that 17 individuals show a score greater than 9 on this test. Of these 17


individuals none remained on the job more than 8 months. The mean 
score of workers remaining on the job more than 8 months was 6.22. 
The mean score of workers remaining on the job less than 4 months was 
7.93. This is a difference of 1.71 which is 3.43 times its standard error. 
The difference between these means is significant at the .1% level. 
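
The significance levels quoted for the two critical ratios can be checked against the normal curve. The sketch below assumes a two-tailed test on a normally distributed ratio, which is the usual reading of a critical ratio but is not spelled out in the text; the function name is hypothetical.

```python
import math


def two_tailed_p(critical_ratio: float) -> float:
    """Two-tailed normal-curve probability for a critical ratio."""
    return math.erfc(abs(critical_ratio) / math.sqrt(2))


for label, cr_value in (
    ("vertical phoria, difference in S.D.", 2.46),
    ("lateral phoria, difference in means", 3.43),
):
    print(f"{label}: CR = {cr_value}, two-tailed p = {two_tailed_p(cr_value):.4f}")
```

Under that assumption a ratio of 2.46 falls below the 2% level and a ratio of 3.43 below the .1% level, in agreement with the levels stated.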

It would therefore appear that those workers who show a deviation 
in vertical phoria in either direction and those workers who show high 
scores in lateral phoria (towards the exophoric end of the scale) more 
characteristically remain on the job less than 4 months.³

Since this lens inspection job requires prolonged visual concentration 
it is not unexpected that these phoria tests should show a relationship 
with tenure on this job. Deviations in phoria are characteristically associ-
ated with visual discomfort. Whereas lack of visual acuity characteristi-
cally interferes with performance on a job, the chief symptom of a phoria
is characteristically discomfort to the worker. The easiest way to allevi-
ate such discomfort would be to quit this particular job or to request


Table 1

          Total         Pass          Fail
Stay     32   32%      31   43%       1    4%
Quit     67   67%      42   57%      25   96%
Total    99            73            26


transfer to another type of work. It might therefore be expected that 
the relationships between phoria tests and turnover would be closer than 
the relationships between acuity tests and turnover. Among this group 
of workers no significant relationship could be found between the visual 
acuities and tenure on the job. 

Those individuals who scored less than 5 or more than 7 on the verti- 
cal phoria test and those who scored more than 9 on the lateral phoria 
test were identified and their length of tenure compared with that of those 
workers whose scores did not fall in these undesirable ranges. The re- 
sulting figures are shown in Table 1. 


It will be seen that 43% of those individuals whose test scores fell 


³ In general, distance acuity tests show greater correlation with success on distance
jobs and near acuity tests show greater correlation with success on near jobs. Distance
acuity tests have shown low relationships with near acuity tests (Tiffin, Joseph. Vision
and industrial production. Illuminating Engineering, XL, No. 4 (April 1945)). Simi-
lar relationships, however, have not in general been shown for the distance and near
phoria tests. The present finding, therefore, that distance phoria tests are related to
turnover on a “near” job is not inconsistent with these earlier findings dealing primarily
with acuity tests.


within the desirable ranges remained on the job more than 8 months 
while 57% remained on the job less than 4 months. On the other hand,
among those individuals whose test scores fell within the undesirable
ranges, only 4% remained on the job more than 8 months while 96%
remained on the job less than 4 months.
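
The percentages just quoted follow from the counts in Table 1, taken within each of the Pass and Fail columns. The short sketch below recomputes them; the dictionary layout and function name are illustrative only.

```python
# Counts from Table 1: tenure group by standing on the phoria tests.
counts = {
    "stay": {"pass": 31, "fail": 1},
    "quit": {"pass": 42, "fail": 25},
}


def column_percentages(table: dict, column: str) -> dict:
    """Percentage of each tenure group within one column of the table."""
    column_total = sum(row[column] for row in table.values())
    return {group: 100.0 * row[column] / column_total for group, row in table.items()}


for column in ("pass", "fail"):
    percentages = column_percentages(counts, column)
    print(column, {group: round(value, 1) for group, value in percentages.items()})
```

The recomputed values agree with the published ones to within one percentage point; 42 of 73 is 57.5%, which the table gives as 57% so that the column sums to 100.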

This relationship was also investigated by the computation of the 
tetrachoric r. This value was .71. 

These data would indicate that for visual inspection jobs such as 
the one studied here, there is a definite relationship between visual phorias 
and job tenure. Those individuals who deviate from ortho-phoria in 
either direction vertically and toward exophoria laterally are much more 
apt to leave the job early. These relationships between the phorias and 
tenure are statistically significant both individually and in combination. 


Received May 24, 1947.


The Validity and Reliability of Heterophoria Scores Yielded 
by Three Commercial Optical Devices * 


J. H. Sulzman, M.D. 
Troy, New York 


Lt. Comdr. E. B. Cook, USNR 
Naval Medical Research Laboratory, New London, Conn. 


and 


N. R. Bartlett 
Johns Hopkins University 


An earlier communication (1) discussed the reliability of visual acuity
measurements with the Bausch and Lomb Ortho-Rater, the American 
Optical Sight Screener, and the Keystone Telebinocular. That report 
also presented data on the agreement between visual acuity scores with 
these devices and scores obtained with a standard acuity test. The 
evaluation of these three instruments is extended in this note to the 
measurement of heterophoria. The same 121 observers and the same 
conditions of testing were used in this second phase of the study as in the 
first, since the data were recorded simultaneously. The testing procedure
is sketched in its essentials in the previous article (1) and is described
in detail in the original Navy research reports (2, 3) from which these
communications are abstracted. To summarize briefly, each person of a
representative population of observers was examined twice with each
of the three instruments and twice with a standard clinical test, with the
order of testing randomized. Then for each method under investigation,
the test-retest reliability coefficient, the distribution of scores for the 
initial test and for the second test, and the coefficient of correlation of 
the test scores with the criterion standard measure were computed. 
This evaluation is based entirely on these statistical indices. 


The Criterion Clinical Test 


Several tests for heterophoria are used by ophthalmologists in clinical 
practice. Three used widely are the Maddox rod, the Screen-Maddox 
* This investigation was carried out at the Medical Research Laboratory, Sub- 
marine Base, New London, Conn. The opinions expressed herein are the private views 
of the authors, and are not to be construed as official or as representing the Naval 
service at large. 


rod, and the Screen and Parallax tests. The three appeared to have
about the same face validity. A preliminary experiment with one hun- 
dred subjects was conducted therefore to settle upon one of the three 
as a criterion. The experiment showed the three to be highly intercor-
related, so a choice on the basis of test-retest reliability, brevity and sim-
plicity was indicated. The measure chosen was the Maddox rod test.
Statistical data for this test are presented later in Tables 1 and 2. 

In the Maddox rod test, a subject sees a small spot of light with his
left eye, and with his right eye sees the same light as a vertical streak 
displaced to one side. Ordinarily the subject does not realize that actu- 
ally he is seeing each object monocularly. He is shown that the line of 
light may be displaced laterally by rotating a Risley prism, and then he 
is asked to adjust the prism until he is satisfied that the line passes 
through the center of the spot. The rotation is made rapidly by most 
observers without pausing or reversing the direction of movement, and 
the first setting for each observer is reported ordinarily to be in good coin- 
cidence. The score for the test is based on the lateral deviation (in 
prism diopters) introduced in achieving coincidence. For the vertical 
measure of imbalance, the Stevens phorometer attachment is used instead 
of the Risley prism and the Maddox rod is rotated to present a horizontal 
streak instead of a vertical streak. 

Maddox rod tests for lateral and vertical phoria were conducted in 
this study with the far target at twenty feet, and the near target at 
thirteen inches. The former was at the same level as the eyes of the 
observer, and the latter six inches below eye level. This depression ap-
proximates the eye-muscle conditions characteristic of ordinary reading.
Altogether, four phoria measures are yielded by this Maddox rod pro- 
cedure; a vertical and a lateral imbalance score when the eyes are focussed 
on the target at distance, and a vertical and a lateral score when they are 
converged on the target close at hand. 

Table 1 presents means and standard deviations of the four scores 
for the initial and for the second tests of the sample of 121 observers. 
Results are expressed in conventional terms of prism diopters. Latent 
deviations from optical alignment in the lateral plane are called exophoria 
(X) if the test indicates the eyes have a tendency to diverge, and esopho-
ria (E) if they tend to converge. On the other hand, latent deviations
in the vertical plane are designated according to the eye which tends to 
deviate above the other; thus, RH denotes hyperphoria for the right 
eye, and LH hyperphoria for the left. 

No large systematic changes in score distributions from the initial
to the second test are revealed in Table 1. The extent to which such
changes occur, and the degree to which relative scores for a population are


Table 1
Test and Retest Means and Standard Deviations for Maddox Rod Test

                             Means               Standard Deviations
                        Test       Retest         Test      Retest
Far Distance
  Lateral              E 1.30     E 1.12          3.78       3.89
  Vertical             LH 0.08    RH 0.28         0.54       0.39
Proximate Distance
  Lateral              X 3.84     X 3.04          6.02       5.65
  Vertical             RH 0.50    RH 0.50         0.61       0.57


consistent from one test to another as indicated by test-retest correlation 
data, together afford an evaluation of reliability. 

Reliability and intercorrelation coefficients for the four tests are shown 
in Table 2. Those in parentheses are test-retest coefficients. The latter 
are high, but of course are not high enough to warrant much reliance on 
a single administration of the test for individual prediction or diagnosis. 
On the other hand, whatever their value, these reliability coefficients can 
serve as a yardstick for assessing corresponding data for the three in- 
struments. The small magnitude of the remaining coefficients in the 
table, indicating the relationship of the several tests to each other, signi- 
fies relative independence of the four separate indices of imbalance. 


Commercial Devices Under Comparison

As stated in the previous section, the Maddox rod test displays a spot
of light to the left eye and a streak of light to the right eye. The same
general principle of presenting dissimilar targets to the two eyes is em-
ployed in all three of the instruments studied.

While the testing procedure with each of the three is essentially the

Table 2
Reliability and Intercorrelation Coefficients for Maddox Rod Tests

                          Far Distance          Proximate Distance
                        Lateral   Vertical      Lateral   Vertical
Far Distance
  Lateral               (0.793)
  Vertical               0.151    (0.623)
Proximate Distance
  Lateral                0.668     0.155        (0.867)
  Vertical               0.173     0.446         0.159    (0.677)


same, the Sight-Screener differs in one major respect from the other two. 
With all three devices, two dissimilar objects are seen, one with each eye; 
one object appears stationary, while the other may appear to drift. The 
drifting object seen by one eye is made in the form of an indicator, which 
appears to point to some one of a row series of dots or steps seen with the 
other eye. The subject reads the value at which the free indicator finally 
comes to rest in the method prescribed for both the Telebinocular and 
the Ortho-Rater. In the Sight-Screener procedure, however, the ex- 
treme excursion to which the indicator swings is read instead of the posi- 
tion at which it comes to rest. 

In addition to this procedural difference, there are a few mechanical
differences in the devices. Telebinocular targets are illuminated from the 
front, while the Sight Screener and Ortho-Rater targets are trans-illumi- 
nated from behind. Decentered convex lenses are used for viewing in 
every case except with the Sight Screener for proximate distance; in the 
latter instance, targets are viewed directly without the interposition of 
prismatic lenses. In addition, the Sight Screener is unique among the 
three in that the Polaroid Vectograph principle is employed for presenting 
separate targets to each eye. Finally, at the time of this investigation,
there was no Telebinocular test available for near vertical phoria. 


Statistical Data and Evaluation 


Test and retest means, standard deviations and test-retest reliability 
coefficients are presented in Table 3. Score units are expressed in the 
raw scale units of the respective instruments.

On the basis of comparative reliabilities, as estimated from the stabi-
lity of test and retest means, from the constancy of test and retest stand-
ard deviations and from test-retest reliability coefficients, there seems to
be no clear-cut argument for choosing any one of the instruments in
preference to any other.

The degree of relationship between the first test and the first cor- 
responding Maddox rod test is presented for each instrument in Table 4. 
For purposes of this comparison, that relationship is an index of validity. 
The validity coefficients, test-retest reliability coefficients, and the va- 
lidity coefficients corrected for attenuation are shown. 
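
The correction for attenuation mentioned here is conventionally computed by dividing the observed validity coefficient by the square root of the product of the two reliabilities. The sketch below applies that standard formula. The validity coefficient of .50 and the instrument reliability of .85 are hypothetical values chosen only for illustration; the criterion reliability of .793 is the far lateral Maddox rod coefficient from Table 2.

```python
import math


def correct_for_attenuation(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Validity coefficient corrected for unreliability in both measures."""
    if not (0 < r_xx <= 1 and 0 < r_yy <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(r_xx * r_yy)


# Hypothetical instrument: raw validity .50, test-retest reliability .85.
# Criterion: Maddox rod far lateral reliability .793 (Table 2).
corrected = correct_for_attenuation(0.50, 0.85, 0.793)
print(f"corrected validity = {corrected:.3f}")
```

A corrected coefficient well short of 1.00, as in this illustration, indicates that unreliability alone does not account for the gap between instrument and criterion.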

The raw validity coefficients in Table 4 (column 2) are low, and fur-
thermore are considerably smaller than the reliability coefficients. So
the validity indices after correction for attenuation are still well below
unity. Evidently, then, the instruments do not tap the same psycho-
physiological functions as do the corresponding four Maddox rod tests.
No hypotheses are offered to account for the fact that the same functions
are not involved; many possible reasons might be proposed to explain it,


but the fact itself is the important thing. But before any interpretations 
about validity are made, the criterion should be examined. It must be 
emphasized that the choice of the Maddox rod test for this purpose was 
arbitrary. Probably the Screen-Maddox rod or the Screen and Parallax 
tests would yield validity coefficients similar to the Maddox rod test, for 
the three are closely interrelated. But the principal argument for em-
ploying any one of the three rests on widespread clinical usage. The
validity figures might have been different had some other kind of criterion
been used.


Table 3
Test and Retest Means, Standard Deviations and Reliability Coefficients
for the Three Devices

                         Means            Standard Deviations    Reliability
                     Test     Retest        Test      Retest     Coefficient
Ortho-Rater
  Far Lateral        6.79      6.59         2.72       2.56
  Far Vertical       5.17      5.11         1.07       1.01
  Near Lateral       8.08      7.51         2.76       2.69
  Near Vertical      4.50      4.59         0.83       0.83
Sight Screener
  Far Lateral       14.94     14.60         1.48       1.92
  Far Vertical       3.06      3.02         0.37       0.29
  Near Lateral      17.07     16.30         3.33       2.97
  Near Vertical      3.02      3.06         0.34       0.39
Telebinocular
  Far Lateral       E 1.48    E 2.20        2.31       3.58
  Far Vertical       4.71      4.73         0.84       0.76
  Near Lateral      E 0.48    E 1.16        5.60       5.4[?]       .880

[Reliability coefficients other than the last are cut off in the source.]


In general, the correlation between instrument measures for lateral
imbalance and corresponding Maddox rod scores is greater than for vertical
measures. Furthermore, the reliability coefficients reveal a greater
consistency for tests of lateral phorias than for tests of vertical phorias.
Of course, this finding may be attributed to the greater relative
homogeneity of the population with respect to vertical imbalances; for example,
the standard error in predicting retest scores is less for vertical phorias
than for lateral. However, if standard score scales were to be constructed
separately for lateral phorias and for vertical phorias on the basis of this
population, then the test-retest consistency in terms of these standard
scores would be much greater for lateral measures than for vertical.

Validity and Reliability of Heterophoria Scores 61 


Table 4 affords a direct comparison of test-retest coefficients of
clinical tests with the coefficients for the several corresponding
instruments. This comparison reveals no loss in reliability in substituting the
machine for the clinician. The only question to be settled before
recommending the use of the more expeditious and less troublesome
mechanical test in place of the clinical procedure is that of comparative
validities. So long as one holds that the mechanical devices must
measure exactly the same functions as does the conventional Maddox rod
test, then the devices cannot be recommended without reservations.
Perhaps a more reasonable statement is that the machines when used
with the manufacturer's recommended procedure present tests differing


Table 4
Reliability Coefficients of the Clinical Measure and the Instrument Measure
and the Intercorrelations between the Two

                   Coefficient      Coefficient of         Coefficient      Intercorrelation
                   of Reliability   Interrelation          of Reliability   Coefficient
                   of Clinical      between Instrument     of Instrument    Corrected for
Name of Test       Measure          and Clinical Measure   Measure          Attenuation
Ortho-Rater
  Far Lateral      0.793            0.564                  0.872            0.68
  Far Vertical     0.623            0.286                  0.630            0.46
  Near Lateral     0.867            0.674                  0.924            0.75
  Near Vertical    0.677            0.343                  0.624            0.53
Sight Screener
  Far Lateral      0.793            0.370                  0.796            0.47
  Far Vertical     0.623            0.279                  0.613            0.45
  Near Lateral     0.867            0.543                  0.831            0.64
  Near Vertical    0.677            0.337                  0.551            0.55
Telebinocular
  Far Lateral      0.793            0.371                  0.755            0.48
  Far Vertical     0.623            0.426                  0.602            0.70
  Near Lateral     0.867            0.683                  0.880            0.78

in some unknown fashion from the conventional clinical procedure, but
that on the other hand they are at least as reliable as the clinical method.


Conclusions 

Perhaps the most useful conclusion is that this comparative evaluation
of heterophoria measurements with the three instruments does not
indicate any clear-cut basis for preferring any one instrument to the other
two. Secondly, no one of the three offers sufficient reliability to warrant
the use of a fine scale for lateral or vertical phoria when the test is
administered only once. Thirdly, the machines offer reliabilities at least as
great as certain clinical tests now accepted and in use. Finally, the
question of validity is not settled, for the scores with the machines do
not correlate to a satisfying degree with scores on one standard clinical test.


Received April 8, 1947. 


References 


1. Sulzman, J. H., Cook, E. B., and Bartlett, N. R. Reliability and validity data for 
certain commercial devices for measuring visual acuity. J. appl. Psychol. 
(In Press.) 

2. Sulzman, J. H., Cook, E. B., and Bartlett, N. R. Visual acuity measurements with 
three commercial screening devices. Bur. Med. Surg. Research Project No. 
X-493 (Av-263-p), Progress Report No. 2; 7 Feb. 1946. 

3. Sulzman, J. H., Cook, E. B., and Bartlett, N. R. Comparative study of measures 
of heterophoria. Bur. Med. Surg. Research Project No. X-493 (Av-263-p), 
Progress Report No. 3; 22 Feb. 1946. 


Construction and Use of Weighted Check-List Rating 
Scales for Two Industrial Situations * 


Edwin B. Knauft 
State University of Iowa 


Several previous investigators (1, 2, 3) have used the Thurstone 
equal-appearing intervals method to select and weight items for use in 
a check-list type of personnel rating device. However, in each of these 
instances the obtained items were then subjected to an extensive item 
analysis procedure which served as the basis for the final form of the 
rating device. Since such item analyses generally require an independent 
criterion measure of the ratees’ job performance and involve a consider- 
able expenditure of time and labor, the use of a simplified technique 
should extend the use of this type of rating method to a broader range of 
industrial situations. The present paper describes the relatively simple 
procedures involved in the construction of rating scales of the check-list 
type by the use of the equal-appearing intervals method. These tech- 
niques have been applied to the construction of rating devices for the two 
very different jobs of laundry press operator and bake shop manager. 


Construction of a Scale for Laundry Press Operators 


The rating scale for laundry press operators¹ was originally
constructed in connection with a validation study of a selection test for these
operators. The problem was somewhat unique because there is an
average of only eight press operators employed in the typical laundry. Since
satisfactory test validation was impossible in any one laundry due to
the small number of cases, it was necessary to obtain data on operators
from a number of different laundries under different managements. It
was not possible to obtain satisfactory objective measures of the
production of each operator, and thus it was necessary to resort to a merit
rating method as the sole criterion of operator proficiency. The final
rating device took the form of a weighted check list of items which were
selected and weighted by the Thurstone equal-appearing intervals

* Originally presented in part at the 1947 meetings of the Midwestern Psychological
Association.
¹ The writer is indebted to Mr. H. H. Hauser for assistance in the collection of data
[...] form of the press operator rating scale.


method. The following steps were involved in the construction of the 
scale: 

1. Each of 25 laundry managers was asked to prepare a list of state- 
ments which described the performance of a press operator. Each 
statement was to express in a simple sentence one thing an operator did 
which would aid or hinder her production or the production of other 
workers around her. About 600 such statements were collected, but 
after editing and eliminating the duplicates, 197 statements were selected. 

2. The 197 statements were submitted to 27 laundry managers for 
sorting. Each statement was reproduced on a separate card and the 
managers were instructed to sort the statements into nine piles which 
represented nine points on a good-to-poor press operator continuum. 
Thus each statement was evaluated in terms of the value to the laundry 
management of the described behavior. 

3. The median scale value of each statement was computed on the 
basis of the position assigned to the statement by each of the 27 sorting 
managers. The semi-interquartile range of each statement was similarly 
computed. 

4. All statements with a Q value of 1 scale unit or more were dis- 
carded, leaving 150 statements for further analysis. 

5. The final scale was constructed from the remaining statements 
on the basis of the median scale value of each statement. Twenty-seven 
statements with scale values at approximate intervals of 0.3 scale unit 
were thus selected for the final scale. The statements with the smallest 
Q values were selected wherever possible. 

6. A second form of the scale was constructed which contained 27 
other items having approximately the same values as the statements in 
the first form. 

7. The items in each form were then placed in random order and the
rater was instructed to mark each statement as being true or not true
of the operator being rated. The score for the individual ratee is
obtained by computing the mean of the scale values of statements which
were checked as being true or descriptive of that ratee. The scale values
of the statements are not shown on the rating check list and are not
known by the raters.
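
Steps 3, 4 and 7 above can be sketched as follows. This is only an illustration under stated assumptions: the medians and quartiles are taken directly from the raw pile numbers with Python's `statistics` module (the original computation may have used interpolation within grouped frequencies), the function names are hypothetical, and the judges' sortings are invented for the example. The three values passed to the scoring function are Form A scale values from Table 1.

```python
from statistics import mean, median, quantiles


def scale_statement(pile_positions):
    """Return (median scale value, Q) for one statement.

    pile_positions holds the pile number (1-9) assigned by each judge.
    Q is the semi-interquartile range, (Q3 - Q1) / 2.
    """
    q1, _, q3 = quantiles(pile_positions, n=4, method="inclusive")
    return median(pile_positions), (q3 - q1) / 2


def retain(q_value, cutoff=1.0):
    """Step 4: discard statements with a Q of 1 scale unit or more."""
    return q_value < cutoff


def score_ratee(checked_values):
    """Step 7: mean scale value of the statements checked as true."""
    return mean(checked_values)


# Hypothetical sortings by nine judges (the study used 27).
agreed = [7, 7, 8, 8, 8, 8, 8, 9, 9]
disputed = [2, 3, 4, 5, 6, 7, 8, 9, 9]

for name, piles in (("agreed", agreed), ("disputed", disputed)):
    value, q = scale_statement(piles)
    print(name, value, q, "kept" if retain(q) else "discarded")

# Score for a ratee on whom three statements were checked as true.
print(round(score_ratee([8.3, 5.8, 7.7]), 2))
```

Because the score is a mean of scale values rather than a count of checks, a rater who checks many statements does not thereby raise or lower the ratee's score.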


The final items included in Forms A and B of the Press Operator 
Rating Scale are listed in Tables 1 and 2. These tables also give the 
scale or scoring value of each item. It will be noted that several items 
appear in both forms of the scale. This procedure was necessary because 
several intervals along the continuum were each represented by only one 
item of acceptable Q value. In such instances the same item had to be 
included in both forms of the scale. 




Table 1 
Items of Form A— Press Operator Rating Scale 


                                                                                 Scale
Item                                                                             Value
She efficiently utilizes all her press area in making lays.                      8.3
She shows moderate interest in her work.                                         4.6
She is not very careful with her equipment.                                      2.8
She is fairly systematic in her work.                                            5.8
She works at a slow but steady pace.                                             4.3
She is absent from work frequently.                                              1.1
It is hard for her to change her ways of doing things.                           4.1
She can change to a different work without much loss of speed.                   7.3
[...] too many lays.                                                             3.0
She makes few errors in lay sequence.                                            6.9
[...] her to change work.                                                        3.8
She does not give enough straight pull to the garment before closing the press.  6.5
Her work seldom requires any extra touch-up.                                     7.9
She is fair at breaking in new workers.                                          5.7
She is careful to keep the garments clean.                                       7.0
She pats and smooths the lays more than necessary.                               3.6
She seldom asks for any time off.                                                7.7
She tries to pass the blame on to others.                                        2.0
She is more interested in quality than speed.                                    6.4
[...]'t seem to get along with anyone.                                           0.6
She picks over items in the damp box too much.                                   3.3
She can turn out good work at high speed.                                        8.4
She frequently threatens to quit.                                                0.7
She accepts constructive criticism well.                                         8.1
She does a sloppy job of pressing.                                               0.9
She never seems to know when to use the damp cloth.                              2.4
She is careful of her equipment.                                                 7.9


Results Obtained With Press Operator Rating Scale

The rating scale was used by 15 different laundry managers to rate
press operators. Data were thus obtained on 118 ratees who were rated
on each of the two forms of the scale. The reliability coefficient
of the scale, based on the scores obtained on the two forms by the 118
ratees, was .87. The reliability coefficient for the full scale of both
forms, as estimated by the Spearman-Brown prophecy formula, was .93.
A composite rating score for each of these operators was obtained by
computing a mean score, based on the score the operator obtained on
both forms. This distribution of scores had a mean of 5.91 and a standard
deviation of 0.81. These scores were subjected to an analysis of
variance to determine if there were significant differences between the
ratings made by the 15 raters. The resulting F (of 1.02) is not significant




Table 2 
Items of Form B—Press Operator Rating Scale 


She has the idea the work is “beneath her.”
She does a thorough job of pressing.
She is a trouble maker.
She misses quite a bit of work because of illness.
She gives every evidence of being interested in her work.
You can never predict how she will take criticism.
When she speeds up, her quality falls off.
She uses poor judgment in the sequence of garments.
She is fair at breaking in new workers.
She doesn’t utilize every opportunity to use double lay.
She makes some unnecessary movements in handling garments.
She can satisfactorily operate any kind of press unit.
She is an extremely slow worker.
She literally drops the garment into proper lay on the press.
She wants to be waited on.
She is seldom distracted from her work.
She is more interested in quality than speed.
She is indifferent about the quality of her work.
She uses good judgment in the use of the damp cloth.
Her work requires more than normal touch-up.
She always has a chip on her shoulder.
She is occasionally late to work.
She is fairly systematic in her work.
She makes few errors in lay sequences.
She does not give enough straight pull to the garment before closing the press.
She does not load her presses to efficient capacity.
She has to be pushed continually. 1.6


at the 5 per cent level of confidence, indicating that the differences be- 
tween mean ratings made by the several raters may be attributed to 
chance. 
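
The step from the between-forms correlation of .87 to the full-scale estimate of .93 can be retraced with the Spearman-Brown prophecy formula. The sketch below assumes the standard general form of that formula with a lengthening factor of 2, since the full scale combines the two forms; the function name is illustrative.

```python
def spearman_brown(r, factor=2.0):
    """Estimated reliability when a test is lengthened `factor` times.

    Standard form: factor * r / (1 + (factor - 1) * r).
    """
    if not -1.0 < r <= 1.0:
        raise ValueError("r must be a correlation above -1")
    return factor * r / (1.0 + (factor - 1.0) * r)


# Correlation between Forms A and B for the 118 press operators.
full_scale = spearman_brown(0.87)
print(round(full_scale, 2))
```

With `factor=2` the expression reduces to 2r / (1 + r), the form usually applied to two equivalent halves or forms.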

It should be noted that a number of the ratings were made by raters
who were located at distant laundries where personal contact and
instruction in rating procedures was impossible. These raters, who had
no previous experience with formal rating methods, made their ratings on
the basis of written instructions sent to them by mail. This experience
serves to demonstrate that extensive training of raters may not be a
requisite with this type of rating method.


A Rating Scale for Retail Bake Shop Managers 


A second application of the weighted check-list type of rating scale
has been made in a study of bake shop managers employed by a large
bakery chain. Each of these managers has responsibility for the
operation of a retail bake shop in which all types of baked goods are produced
and sold to the public. The items were collected and scaled by the same
procedures as were used with the press operator scale with the one
exception that the statements for the manager scale were placed into nine
categories by a modified and simpler technique. This simpler
method requires each judge to evaluate each statement by encircling one
of the numbers from 1 to 9 following the statement. The numbers refer
to positions on a continuum which ranges from “descriptive of a manager
of no value to the company” to “descriptive of a manager of very great
value to the company.” A study by Seashore and Hevner (4) has
indicated that such a procedure will produce statement scale values which
are not significantly different from the values obtained by the more
cumbersome card sorting procedure.

One hundred and ninety-six statements were scaled by 17 judges 
who were either district managers, assistant district managers or home 


Table 3 
Items of Form A—Bake Shop Manager Rating Scale 
                                                                         Scale
Item                                                                     Value
He occasionally buys some of his competitor's products.                  6.8
He never consults with his head salesgirl when making out a bake order.  1.4
He belongs to a local merchants' association.                            4.9
He criticizes his employees unnecessarily.                               0.8
The window display is usually just fair.                                 3.1
He enjoys contacting customers personally.                               14
He does not know how to figure costs of products.                        0.6
He lacks a long range viewpoint.                                         2.5
His products are of uniformly high quality.                              8.5
He expects too much of his employees.                                    2.2
His weekly and monthly reports are sometimes inaccurate.                 4.2
He does not always give enough thought to his bake orders.               1.6
He occasionally runs a selling contest among his salesgirls.             6.8
Baking in his shop continues until 2 P.M. or later.                      8.2
He keeps complaining about employees but doesn’t remedy the situation.   0.9
He has originated one or more workable new formulas.                     6.4
He sometimes has an unreasonably large inventory of certain items.       3.3
Employees enjoy working for him.                                         16
He does not delegate enough responsibility to others.                    2.8
He has accurately figured the costs of most of his products.             7.8
He wishes he were just a baker.
His shop is about average in cleanliness.
He is tardy in making minor repairs in his sales room.                   1.9
He periodically samples all of his products for quality.                 8.1




Table 4 
Items of Form B— Bake Shop Manager Rating Scale 
                                                                                   Scale
Item                                                                               Value
His window display always has customer appeal.                                     8.5
He gives his employees the reasons for his decisions.                              6.7
Products dropped on the floor are sometimes sold.                                  1.4
He always gets his reports in on time.                                             7.9
He rarely figures the costs of his products.                                       1.0
He belongs to a local merchants' association.                                      4.9
His bakers tend to pass some of their work off onto him.                           3.1
He seldom forgets what he has once been told.                                      7.6
He occasionally runs a selling contest among his salesgirls.                       6.8
He is slow to discipline his employees even when he should.                        1.9
He does not anticipate probable emergencies.                                       2.4
His weekly and monthly reports are sometimes inaccurate.                           4.2
He should take more interest in merchandising.                                     3.5
He is slow at making decisions.                                                    4.4
His bakers do not respect him.                                                     0.8
His sales per customer are relatively high.                                        7.4
He has originated one or more workable new formulas.                               6.4
No baking is done in his shop after 12 noon.                                       0.6
He encourages his employees to show initiative.                                    8.1
He knows how but can't teach others.                                               2.5
His shop is unusually neat and clean.                                              8.3
He often has vermin and insects in his shop.                                       0.8
His salesgirls sometimes fail to use wax paper to handle goods in the sales room.  3.2
He pays little attention to his customers.                                         1.6


office officials thoroughly familiar with actual bake shop operations. 
Two equivalent forms of a check-list scale were constructed by methods 
similar to those used with the press operator scale. Each form contained 
24 items. Tables 3 and 4 contain the items comprising the two forms 
of the Bake Shop Manager Rating Scale. 

This scale was used to evaluate 79 bake shop managers. The basic 
data were obtained from the ratings resulting when each district manager 
applied both forms of the scale to his shop managers. The reliability 
of the scale, based on the product moment correlation of scores on the 
two forms, was .79. The reliability of the combined scale of both forms 
was estimated by the Spearman-Brown formula to be .88. Additional 
data were available on 35 of the above managers who were also rated 
by their respective assistant district managers. The reliability coeffi- 
cient of the scale, based upon the ratings of the thirty-five managers 
evaluated by two raters, was .81. 

An estimate was also made of the relationship between the rating
scores and ranks obtained by asking each district manager to rank his
store managers in order from best to worst. Although the resulting
rank orders cannot be considered as entirely independent measures of
the managers’ on-the-job ability, a comparison of the rankings with the
scale scores obtained by these ratees does furnish a rough validity
estimate of the scale. A rank difference correlation was computed
between the rating scores and the rankings each district manager made of
his managers. The average of these correlations, based on the ratings
and rankings of 79 ratees by seven district managers, was .59. If the
correlation of one rater (rho = .02) is omitted, the mean rho becomes
.69 for the remaining six raters who represent 71 ratees.
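
A minimal sketch of the rank difference correlation follows. It assumes the usual formula without ties, rho = 1 - 6Σd² / n(n² - 1), and uses invented scores and rankings for five managers; the helper names are hypothetical. The last lines retrace the reported change in mean rho on the further assumption that the averages are simple unweighted means of the seven coefficients.

```python
def ranks_from_scores(scores):
    """Rank 1 goes to the highest score (no ties assumed)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def rank_difference_rho(ranks_a, ranks_b):
    """Spearman rank difference correlation for untied ranks."""
    n = len(ranks_a)
    if n != len(ranks_b) or n < 2:
        raise ValueError("need two rank lists of equal length >= 2")
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical check-list scores and a district manager's ranking.
scale_scores = [6.8, 5.1, 7.2, 4.4, 5.9]
manager_ranking = [1, 4, 2, 5, 3]
rho = rank_difference_rho(ranks_from_scores(scale_scores), manager_ranking)
print(round(rho, 2))

# Dropping the one rater with rho = .02 from a mean of .59 over seven raters.
mean_without_outlier = (7 * 0.59 - 0.02) / 6
print(mean_without_outlier)
```

The second computation gives a value close to the reported .69, which is what one would expect if the published means were rounded from unweighted averages.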

A composite score for each of the 79 ratees was computed on the 
basis of the rating resulting when each district manager applied both 
forms of the check-list scale to the store managers in his respective district. 
The mean of this distribution of scores was 5.51, with a standard devia- 
tion of 0.87 and a range from 3.51 to 7.08. An analysis of variance of 
these scores was made to determine if there were significant differences 
in ratings made by different raters. The resulting F value of 2.15 is not 
significant at the 5 per cent level of confidence, indicating that purely 
chance differences probably explain the differences in mean ratings be- 
tween the seven raters. 
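
The analysis of variance described above compares the variation between raters' mean ratings with the variation among ratees within each rater. The sketch below is a plain one-way analysis with invented scores for three raters; it is not the authors' worksheet, and the function name is hypothetical. For the bake shop data the corresponding degrees of freedom would be 6 and 72 (seven raters, 79 ratees).

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more scores than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Between-groups sum of squares: spread of the rater means.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-groups sum of squares: spread of ratees around their rater's mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical composite scores grouped by rater.
ratings_by_rater = [[5, 6, 7], [6, 7, 8], [4, 5, 6]]
f_ratio, df_b, df_w = one_way_anova_f(ratings_by_rater)
print(f_ratio, df_b, df_w)
```

The F ratio is then referred to a table of F for the two degrees of freedom to judge significance at the chosen level.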


Discussion 


The relatively simple techniques here utilized have yielded scales
which compare favorably in reliability with those constructed and refined
by more tedious and precise item analysis methods. The equal-appearing
intervals technique as applied to merit rating scale construction has thus
far received very limited application in industry in spite of several
advantages of this type of scale over conventional graphic or linear scales.
These advantages include:


1. The application of a systematic method of utilizing the opinions
of a large number of “experts” in the selection and weighting of the items,
rather than relying on the judgment of one scale constructor.

2. The ease with which two equivalent forms of the same scale may
be constructed, enabling the investigator to obtain a precise reliability
measure of the instrument.

3. An objective and rapid method of scoring the completed forms.

4. The exact values or weights of each item are unknown to the raters
and may not be readily deduced by them.

5. Requires very few directions to the raters and no extensive training
program in rating techniques.

6. May be used as an over-all criterion measure because it may be
constructed to sample a large number of different aspects of the
employee's on-the-job behavior.

7. In specific industrial situations this type of scale has exhibited 
satisfactory reliability and has yielded distributions of acceptable form 
and range. 


Summary 


1. The Thurstone equal-appearing intervals technique has been used 
to select and weight items for two check-list merit rating scales which 
were used to evaluate personnel. Unlike previous investigations of a 
similar nature, the present procedure does not make use of an inde- 
pendent criterion in the construction of the final scale. 

2. Two forms of a rating scale for laundry press operators were con- 
structed. Managers of 15 laundries rated 118 press operators on both 
forms of this scale. The reliability of the total scale of 54 statements as 
determined from the two forms was .93. 

3. In a similar manner a rating scale was constructed for managers 
of bake shops in a large chain. The reliability of this total scale, based
on two forms, was .88 (N = 114). The ratings of pairs of raters
correlated .81.

4. The rating scores obtained on the manager scale were subjected 
to an analysis of variance and the resulting F value indicated that purely 
chance differences probably explain the differences in mean ratings be- 
tween the several raters. Similar results were also obtained in the case 
of the press operator scale. 


Received May 22, 1947. 


References 

1. Ferguson, L. W. The development of an adequate method of appraisal. Amer.
Psychol., 1946, 1, 279 (Abstract).

2. Ferguson, L. W. The development of a method of appraisal for assistant managers.
J. appl. Psychol., 1947, 31, 306-311.

3. Richardson, M. W., and Kuder, G. F. Making a rating scale that measures. Person.
J., 1933, 12, 36-40.

4. Seashore, R. H., and Hevner, K. A time-saving device for the construction of
attitude scales. J. soc. Psychol., 1933, 4, 366-372.


Communication Between Management and Workers * 


Donald G. Paterson and James J. Jenkins 
University of Minnesota 


The general problem of communication between management and 
workers has been receiving increased emphasis in recent industrial person- 
nel literature. 


History of the Problem 


Roethlisberger (19), Roethlisberger and Dickson (20), Gardner (7), 
Mayo (12), and Pigors and Myers (15) may be taken as examples of the 
recent emphasis on the factory as a social institution in which problems 
of structure, status, coordination and cooperation of groups and commu- 
nication between top management and rank and file workers are stressed 
and discussed at length. For example, there are 21 references to com- 
munication in the index of Gardner’s book. But in these approaches no 
hint is given as to how written communications should be formulated or 
how the problem of readability can be attacked in an objective manner. 
In Pigors and Myers (15) the employee handbook is recommended as the 
principal medium of communication with stress on content. The only 
reference to readability is that it be “clearly written” but no information 
is given as to how this can be accomplished or, if attempted, how the 
readability of the content may be measured. 

In the more comprehensive texts on industrial relations and
personnel work such authors as Yoder (26), Scott, et al. (21), and Watkins and
Dodd (25) discuss the problem of communication between management
and workers in a more practical manner in terms of the makeup and
content of employee handbooks, house organs, employee magazines, and
bulletin board notices. But here again, although the admonition may
be given that these should be written and edited for clarity and
interest-holding power, no information is given as to how this can be done or as
to how readability may be determined. Watkins and Dodd do give one

* The writers have used the conventional term management and workers throughout
this paper. Dr. Dale Yoder, Director of the Industrial Relations Center at the
University of Minnesota, suggests that a better terminology would be “managements and
employees” because management is an abstraction. Furthermore, he believes that the
term employees is to be preferred to workers since the latter includes all members of the
labor force (managers, self-employed, rank and file employees, and others).


hint, namely, that some employee magazines have failed because they
have been “too high brow” for the man in the shop. 

Heron (8a) devotes a whole book to the problem of sharing
information with employees and, in our opinion, provides an excellent discussion
of the compelling reasons for the necessity of such sharing as a principal
function in industrial relations work. Unfortunately, the technical
problem of how-to-do-it, in the sense of insuring comprehensibility and
interest-holding power of printed communications, is not touched upon.

Filipetti, in his survey (3) of the literature of scientific management
from the days of Frederick W. Taylor through World War II, does not
mention the problem of communication. The index to his book does not
contain any items such as communication, employee handbooks, bulletin
board notices or readability. Presumably, information about scientific
management procedures and techniques is transmitted from
management to workers by verbal means only.

Even Powell and Schild (18), writing for the American Management
Association and attempting to tell management how to prepare and
publish an employee manual, fail to come to grips with the problem of
readability. Their manual summarizes the results of a survey of
company practices and gives selected samples from “good handbooks.” It
also states, “Try to make it attractive and as readable as possible” (18,
p. 22) but the description of a technique for testing the readability of the
language of employee handbooks is conspicuously absent.

The industrial psychologists do no better. Oakley (13), Tiffin (24), 
Maier (11), Burtt (2), and Poffenberger (17) do not even mention the 
problem of communication nor such media as employee handbooks, house 
organs, or bulletin board notices. 

* Recently, increasing attention has been given, however, by industrial relations
experts to the problem of simplifying the communication of financial statements to
employees and by these same experts and personnel psychologists to the problem of
arranging adequate means of communicating the feelings and attitudes of the rank and
file to top management through suggestion systems and attitude and morale surveys.
In opinion polling a great deal of attention has been given to the wording of [...]
will be well within the comprehension level of the “average man” (21, [...]

The strange thing is that applied psychologists in dealing with the
psychology of advertising devote a great deal of space to the problem of
making advertising copy readable. See Kitson (10), Poffenberger (16)
and Burtt (1). Not only is the problem recognized by these authors but
sufficient facts and descriptions of methodology are given to aid the
advertising copy writer in making his copy readily comprehensible to the
“average man.” As a matter of historic fact, Kitson (10) as early as
1921 developed an objective method of measuring the differences in
readability or comprehensibility and applied it to newspaper and magazine
copy. He used frequency with which one, two, three, etc. syllable words
are found, coupled with the average length of sentences. He seems to
have been the real pioneer.
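
The two quantities attributed to Kitson here, the frequency of words of each syllable length and the average sentence length, can be tallied roughly as follows. This is a loose modern sketch and not Kitson's own procedure: syllables are estimated by counting groups of vowels, which is only a crude heuristic, and the function names and the sample passage are invented for the illustration.

```python
import re
from collections import Counter


def count_syllables(word):
    """Crude estimate: one syllable per run of vowels (a, e, i, o, u, y)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def readability_counts(text):
    """Return (syllable-length frequencies, average words per sentence)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllable_freq = Counter(count_syllables(w) for w in words)
    average_length = len(words) / len(sentences) if sentences else 0.0
    return syllable_freq, average_length


sample = "The cat sat. The dog ran away."
freq, avg_len = readability_counts(sample)
print(dict(freq), avg_len)
```

Two passages can then be compared on the share of longer words and on average sentence length, which is the kind of contrast the method relies on.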

The above paradox is most striking in that Burtt and Poffenberger,
when they shift from advertising to vocational, employment and
industrial psychology, fail to stress or rather fail to mention the problem of
communication and hence give no hints as to how to measure the
readability of copy in communication from management to the workers.
Poffenberger (17) in his book on applied psychology devotes only 37
pages to advertising yet stresses readability whereas he devotes 227 pages
to vocational, employment, and industrial psychology without
mentioning the problem of communication in the latter situations. Burtt in his
568 pages on employment psychology (2) likewise fails to mention the
problem although he devotes adequate space to the problem of
readability in his book on advertising.*

The problem of readability, however, has been treated far more ade- 
quately by Smith, Lasswell, and Casey (22), by Gray and Leary (8) and 
more recently by Flesch (4, 5, 6). Again, as a matter of historic fact, 
W. S. Gray seems to have been alert to the need for an index of read- 
ability. As early as 1935 he had developed mathematical formulae for 
measuring degrees of readability and applied them in obtaining indexes 
of difficulty of 350 books. Unfortunately his method does not seem to 
have “caught on.” 

Of course, at a still earlier date Thorndike (23) and Horn (9) had 
produced word counts that were used not only by authors in writing text 
books for children but also by advertisers in simplifying the vocabulary 
of advertising copy. 

All of this historical introduction points to a “cultural lag” * in
applying to personnel work what has long been known in the field of
advertising and in the writing of text books for children and for the “average
adult.” To be sure, the problem of communication between management
and workers may have been regarded as of minor importance. But
industrial relations experts and personnel psychologists can no longer ignore
the problem especially in view of the importance attached to
communication by those who are now stressing the social structure and human
relations approach to management problems in business and industry.
The effectiveness with which the Flesch formula can now be applied to
the problem also forces the issue.


* The senior writer pleads guilty to the same offense because when he taught
advertising psychology and personnel psychology simultaneously during the 1920's he stressed
readability in advertising but failed to recognize its applicability in personnel work.

* The phenomenon of “cultural lag” described so ably by Ogburn (14) is important
and ever present in any field of knowledge or activity. In psychology many instances
could be cited. In this particular matter, the lag seems to have been prolonged by
compartmentalized thinking in which people as consumers are recognized as of primary
importance whereas people as workers are accorded a secondary role from the point of
view of their needs for an understanding of the policies and practices followed by top
management.


А Sample Study 


In setting up а problem dealing with the selection of power sewing 
machine operators in a textile factory, the writers were impressed by the 
attempt of the personnel manager to put in the hands of applicants а 
printed statement in regard to employment opportunities. Here was a 
time saver. No longer need the preliminary interviewer give a verbal 
description of the Company and its hiring policies prior to application 
taking. Here was a conscious attempt to give each potential applicant 
а picture of the situation so that each could decide whether or not to 
put in an application. The applicant information sheet (Form A) is 
reproduced as Figure 1.

A glance at the wording suggested to the writers that it was probably 
pitched at a readability level far beyond the comprehension capacity of 
the average applicant for factory work. This led to an attempt to sim- 
plify the language and structure and to increase its human interest value 
without changing the ideas contained in Form A. This was done by 
following the rules of Flesch (6). The resulting Form B is shown as 
Figure 2. 


APPLICANTS' GENERAL INFORMATION—FORM A


Before completing your application it would be advantageous for each
applicant to know something about our Company, the products we
manufacture, and the employment opportunities for experienced and
inexperienced persons.


TYPE OF WORK—Our Company manufactures a very high quality line
of women's and children's underwear, women's nightwear and slips.


IS IT NECESSARY TO HAVE EXPERIENCE—Many organizations
follow the policy of employing only experienced operators. Although it is
desirable that applicants have had some training, our records indicate that
many of our highest-paid operators have had no previous experience or
training before they entered our employ. Sincere interest in this type of
work is much more important than previous experience in some other
organization.


CAN ANYONE LEARN THIS WORK—We do not encourage everyone who
makes application to enter this industry. We recognize that some applicants
are not interested or suited for this work, just as there are women who are not
suited by temperament to enter office work, nursing, or some similar vocation.
It would be unfair to you as an applicant, and to the Company if we were
to adopt a policy of hiring all applicants.




+ Communication Between Management and Workers 75 


HOW LONG DOES IT TAKE TO BECOME A GOOD OPERATOR—A
sewing machine, like a typewriter, comptometer or calculator, is an
[illegible] piece of equipment. A student entering business college spends
six months [learning] the fundamentals of typing. Speed and proficiency are
acquired only with practice and experience in a business office. The average
[illegible] with an aptitude for sewing can learn to operate a machine in
from 2 to [illegible] months and become an experienced and skilled operator
in from 4 to 6 months. We are required to make an expenditure of more than
$300 for the training for each operator.


HOW MUCH CAN I EARN AS A POWER SEWING MACHINE
OPERATOR—All persons entering our employ are guaranteed 60¢ an hour for
the [first] 8 months and 65¢ an hour thereafter. Our plant average for
persons on [piece] work and having a minimum of 6 months' experience or
more is approximately $1.00 an hour. We do not wish you to assume that you
will be paid $1.00 an hour at the end of your training period or at the end
of the 6 months' period, however, fair piece work rates assure you the
opportunity of [earning according] to your ability. We desire to train only
those persons interested in continuous employment.


WHAT ARE THE ADVANTAGES OVER OTHER TYPES OF WORK
OPEN TO WOMEN—Persons employed in this industry and by this
organization are assured of permanent employment, at light, clean,
interesting work [which] most women enjoy. Many industries have openings
for only unskilled or semi-skilled work, such as food packaging, stock work,
and retail selling. These positions, for the most part, require very little
training and the earning [possibil]ities are therefore definitely limited.


IS IT POSSIBLE TO DETERMINE WHETHER AN APPLICANT CAN
BECOME A SUCCESSFUL OPERATOR—It is difficult because of the great
differences in individuals to be 100% accurate in selecting potentially
successful [operators]. We know that most persons who have normal vision
and above average finger dexterity will be successful if they are truly
interested, have patience, and will carefully follow the instructions of our
training supervisors.


If you are interested in this employment opportunity, it will be necessary for
you to complete the attached Application and Supplementary Information
Schedule. You will then be given an eye test in this office and referred to
the [illegible] Employment Service where you will take several short simple
tests. Our records indicate that, all other factors being equal, persons who
do well in these tests usually become skillful highly-paid operators.


We wish to state in conclusion that if you are selected for employment we
[will] do everything possible to assure your success. Our instructors will
assist [and teach] you during your training period; and they will be
constantly available for guidance and direction. During the past year many
persons have entered our employ and have become a permanent and satisfied
part of this [organization].


Fig. 1. Copy of original information sheet for potential applicants (Form A).*


* The reader will notice that the section headings are set in all capitals in
Form A, the original, but that caps and lower case, bold face are used in
Form B. This has been done in order to direct attention to the importance of
using the best possible typographical arrangement, to promote speed and ease
of reading the copy. This is another important though frequently neglected
aspect of the problem of printed communication. See Paterson, D. G. and
Tinker, M. A. How to make type readable. New York: Harper and Brothers,
1940. Pp. 209. (Obtainable from the writers.)




were measured as sections. Section 6, since Form B was shorter, did not
run to 100 words so extrapolation was resorted to. 
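
The extrapolation mentioned here can be pictured as simple proportional scaling. The sketch below is only an assumption about how a count taken on a passage shorter than 100 words might be put on a per-100-words basis; the function name and the sample figures are hypothetical and are not taken from the study.

```python
def per_100_words(count, words_in_passage):
    """Scale a raw count to a per-100-words basis by simple proportion."""
    if words_in_passage <= 0:
        raise ValueError("passage must contain at least one word")
    return count * 100.0 / words_in_passage


# Hypothetical figures: 6 personal references counted in an 80-word section.
scaled = per_100_words(6, 80)
print(scaled)  # 7.5
```

Proportional scaling treats the short section as if it continued in the same style, which is the usual reading of "extrapolation" but may not be exactly what the writers did.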

It will be noted that Form A had been written at the “Hard to Read”
level of difficulty appropriate for high school or college students. There
is some variability from section to section from “Fairly Hard” to “Very
Hard.” It should be noted that the first section of Form A was written
at the level characteristic of scientific writing requiring college level at- 
tainment to understand. 

Form B on the other hand was written on an “Easy” level which can 
be comprehended by those with only a fifth grade educational attain- 
ment level. There is also some variability from section to section ranging 
from “Very Easy” to “Fairly Easy.” 

The objection may be raised immediately by some that, by presenting
such simple 4th, 5th and 6th grade level language, the personnel
department runs a grave risk of insulting the intelligence of the applicants,
most of whom have probably had some high school training. One need
only re-read Form B to realize that simplification of language need not 
result in “childish” writing nor is there anything in the information as 
given or as it is stated that would insult the intelligence of anyone. 
Bright people can quickly understand Form B. But dull people or even 
“average” people cannot understand or at least will have great difficulty 
in understanding text material suitable for college students. While we 
believe that Form A deserves the label “high brow,” we do not believe 
that Form B deserves the label “low brow.” If this be accepted, there 
can be no quarrel with the simplicity of Form B. 

One other point deserves mention. Form B includes many more
personal pronouns, the use of which Flesch insists is a legitimate technique
for enhancing the interest value of any type of writing whether it be “Very
Hard” or “Very Easy.”


Summary 


1. The literature of industrial relations and personnel work has either 
ignored the problem of communication between management and workers 
or has failed to emphasize the importance of communicating in language 
that the “average man” can understand and has given no hints as to the 
techniques of readability measurement that can be applied. 

2. The literature of advertising, however, is in marked contrast. 
Here the necessity of putting advertising copy into simple language has 
been emphasized and ways and means of doing so have been stressed. 

3. Although Kitson in 1921 and W. S. Gray in 1935 developed ways
and means of testing the readability of printed prose, it remained for
Flesch to develop and “put across” a tedious but accurate method of
readability measurement. His method can now be applied to commu-
nications between management and workers as well as in many other 
fields. 

4. А sample study was made of an information sheet for potential 
applicants in a needle trades factory. Its Flesch readability score 
classified it as “Hard,” typical of an academic type magazine and
requiring a high school or some college level of reading ability to compre-
hend it. А simpler form was prepared, preserving the ideas, and the 
resulting Flesch readability score was found to be “Easy,” typical of 
pulp fiction writing and comprehensible to those with a 5th grade level 
of reading ability. 

5. The rules set forth by Kitson (10), Thorndike (23), Horn (9), and
Flesch (6) for making printed copy easy to understand and at the same
time interesting should be followed by those who wish to prepare copy
that will be readily understood by applicants for factory jobs. These 
are: (а) Use short sentences; (b) use words of one or two syllables; (с) use 
simple sentence structure; (d) use words in frequent use by consulting 
word lists; (e) avoid unnecessary adjectives and (f) make the message 
personal by using personal pronouns. 

6. The literature on industrial relations and personnel psychology 
should, henceforth, not only place greater emphasis on the importance of 
communication but should devote much more attention to the “how-to- 
do-it” aspect of insuring communications that will be readable to the 
man in the shop and will rate high in interest value. 
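
Rules (a) and (f) in point 5 above lend themselves to a rough mechanical check. The following is a minimal sketch, not the Flesch formula itself: it merely reports average sentence length and personal pronouns per 100 words. The pronoun list, the sentence-splitting rule, the function name, and the sample copy are all assumptions made for illustration.

```python
import re

# Assumed, deliberately short list of personal pronouns.
PERSONAL_PRONOUNS = {
    "i", "me", "my", "we", "us", "our", "you", "your",
    "he", "him", "his", "she", "her", "they", "them", "their",
}


def copy_profile(text):
    """Return average words per sentence and personal pronouns per 100 words."""
    # Treat any run of ., ! or ? as a sentence boundary (a crude assumption).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence")
    pronouns = sum(1 for w in words if w.lower() in PERSONAL_PRONOUNS)
    return {
        "words_per_sentence": len(words) / len(sentences),
        "pronouns_per_100_words": 100.0 * pronouns / len(words),
    }


# Hypothetical copy written in the spirit of the rules above.
sample = "We make fine underwear. You can learn the work. We will teach you."
profile = copy_profile(sample)
print(profile)
```

A check of this kind only flags long sentences and impersonal copy; it says nothing about word familiarity, for which the word lists cited in rule (d) would still have to be consulted.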

Received November 29, 1947. 


Early publication. 
References 
1. Burtt, H. E. The psychology of advertising. New York: Houghton Mifflin Co.,
    1938. Pp. 473.

2. Burtt, H. E. Principles of employment psychology. Revised Edition. New York:
    Harper and Brothers, 1942. Pp. 568.

3. Filipetti, G. Industrial management in transition. Chicago: Richard D. Irwin,
    Inc., 1946. Pp. 311.

4. Flesch, R. Estimating the comprehension difficulty of magazine articles. J. gen.
    Psychol., 1943, 28, 63-80.

5. Flesch, R. Marks of readable style. New York: Columbia University, Teachers
    College, Contributions to education, No. 897, 1943. Pp. 69.

6. Flesch, R. The art of plain talk. New York: Harper and Brothers, 1946. Pp. 210.

7. Gardner, B. B. Human relations in industry. Chicago: Richard D. Irwin, Inc.,
    1945. Pp. 307.

8. Gray, W. S., and Leary, B. E. What makes a book readable, with special reference to
    adults with limited reading ability: An initial study. Chicago: University of
    Chicago Press, 1935. Pp. 358.

Heron, A. R. Sharing information with employees. Stanford: Stanford University
    Press, 1942. Pp. 204.




9. Horn, E. A basic writing vocabulary, 10,000 words most commonly used in writing.
    Iowa City: University of Iowa, College of Education, Monographs in Education,
    First Series, No. 4. April, 1926.

10. Kitson, H. D. The mind of the buyer. New York: Macmillan Co., 1921. Pp. 211.

11. Maier, N. R. F. Psychology in industry. Boston: Houghton Mifflin Co., 1946.
    Pp. 463.

12. Mayo, E. The social problems of an industrial civilization. Andover, Mass.: The
    Andover Press, 1945. Pp. 150.

13. Oakley, C. A. Men at work. London: Hodder and Stoughton, Ltd., 1946.
    Pp. 301.

14. Ogburn, W. F. Social change with respect to culture and original nature. New
    York: The Viking Press, Inc., 1922. Pp. 365.

15. Pigors, P., and Myers, C. A. Personnel administration. A point of view and a
    method. New York: McGraw-Hill Book Co., 1947. Pp. 553.

16. Poffenberger, A. T. Psychology in advertising. Second Edition. New York:
    McGraw-Hill, 1932. Pp. 634.

17. Poffenberger, A. T. Principles of applied psychology. New York: D. Appleton-
    Century Co., 1942. Pp. 655.

18. Powell, L., and Schild, H. W. How to prepare and publish an employee manual.
    Third Edition. New York: American Management Association, 1947. Pp. 35.

19. Roethlisberger, F. J. Management and morale. Cambridge: Harvard University
    Press, 1944. Pp. 194.

20. Roethlisberger, F. J., and Dickson, W. J. Management and the worker.
    Cambridge: Harvard University Press, 1939. Pp. 615.

21. Scott, W. D., Clothier, R. C., Mathewson, S. B., and Spriegel, W. R. Personnel
    management. Third Edition. New York: McGraw-Hill Book Co., 1941.
    Pp. 589.

22. Smith, B. L., Lasswell, H. D., and Casey, R. D. Propaganda, communication, and
    public opinion. A comprehensive reference guide. Princeton: Princeton
    University Press, 1946. Pp. 435.

23. Thorndike, E. L. A teacher's word book of the twenty thousand words found most
    frequently and widely in general reading for children and young people. New
    York: Teachers College Bureau of Publications, Columbia University, 1931.
    Pp. 182.

24. Tiffin, J. Industrial psychology. Second Edition. New York: Prentice-Hall, Inc.,
    1947. Pp. 553.

25. Watkins, G. S., and Dodd, P. A. The management of labor relations. New York:
    McGraw-Hill Book Company, Inc., 1938. Pp. 780.

26. Yoder, D. Personnel management and industrial relations. Revised Edition. New
    York: Prentice-Hall, Inc., 1942. Pp. 848.


College Students' Opinions of Radio Advertising


Gove P. Laybourn and H. P. Longstaff 
University of Minnesota 


One important measure of the effectiveness of a particular kind of 
advertising is consumer acceptance of that advertising. Since radio 
advertising has grown to such importance, it was thought that the opin- 
ions of а group of students studying advertising might be informative. 
Thus the purpose of this study was to discover the likes and dislikes of 
а group of college students for various types of radio advertising. 

The subjects were two hundred University of Minnesota Juniors and 
Seniors enrolled in the course, Psychology of Advertising. The majority 
of these students were advertising majors. Admittedly this was a highly
selected group; nevertheless they represent not only consumers of their
particular class but also the future leaders in the advertising profession 
and as such their opinions should be significant. 

These students were asked to submit examples of radio advertising 
they liked best and the kind they disliked most. They were also in- 
structed to give reasons supporting both their likes and dislikes. Some 
students submitted several advertisements which represented their likes 
and dislikes, but in order to maintain comparableness of the responses 
of all the subjects, only those advertisements which were given first 
choice were tabulated. 

The data were tabulated so as to indicate the total number of
responses favorable or unfavorable to each advertisement and these figures
were converted into percentages. These data are presented in Tables 1
through 5. Tables 1 and 2 show the specific radio advertisements which
were selected as the best liked and the most disliked, respectively. In
Table 3 the general classes of radio advertisements which were selected
as the best liked and the most disliked are given. Table 4 yields a
comparison of the best liked and the most disliked commercials in that class
of radio advertisements which was both best liked and most disliked.
In Table 5 the reasons for liking and disliking radio advertisements as 
determined by this study and by a study made by the National Opinion 
Research Center are listed. 

Referring to Table 1, it may be seen that the first four advertisements
listed comprise the best liked of 42.5 per cent of the subjects. The others
are liked by relatively much smaller numbers of subjects. Five per cent
of the subjects state that they know of no particular radio advertisement
which they like.


Table 1
Best Liked Radio Advertisements Submitted by 200 College Students

Program                 Sponsor        Feature Liked                     Number  Per Cent
Fibber McGee and Molly  Johnson Wax    Fibber's kidding of announcer         30      15.0
Jack Benny              Lucky Strike   Jack's frustration by quartet         26      13.0
Bob Hope                Pepsodent      “Poor Miriam” song                    16       8.0
Station-break spot      United Fruit   “I'm Chiquita Banana”                 13       6.5
Sunday Evening Hour     Ford           Informative institutional talk         8       4.0
Supper Club             Chesterfield   “ABC” theme song                       7       3.5
Hollywood Players       Cresta Blanca  “C-r-e-s-t-a B-l-a-n-c-a” theme        6       3.0
Station-break spot      Virginia Dare  “Say It Again, Virginia Dare”         5       2.5
Station-break spot      Wildroot       “Wildroot Cream Oil Charlie”           5       2.5
Henry Morgan            Eversharp      Morgan's ribbing of sponsor            5       2.5
N. Y. Philharmonic      U. S. Rubber   Brief commercial at start and end      5       2.5
Hit Parade              Lucky Strike   Auctioneer and repeated slogans        5       2.5
Family Hour             Prudential     Informative talk on insurance          4       2.0
Bob Hawk                Camel          “C-a-m-e-l-s” and “Lemac” songs        3       1.5
Information Please      Parker         Brief and direct commercial            3       1.5
Telephone Hour          Bell           Interesting and informative talk       3       1.5
Cavalcade of Sports     Gillette       Commercial at sport-cast break         3       1.5
Joan Davis              Swan           Commercial in dialogue by cast         3       1.5
Blondie                 Colgate        Commercial in dialogue by cast         2       1.0
Merideth Wilson         Canada Dry     “Canada Dry Water” theme song          2       1.0
Jack Carson             Campbell       “Mmm-mmm, Good” theme song             2       1.0
Others                                                                       34      17.0
No likes                                                                     10       5.0
Total                                                                       200     100.0
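
The percentages quoted for Table 1 follow directly from the counts, since each is a share of the 200 subjects. The short sketch below is offered only as a check on that arithmetic; the dictionary keys and function name are illustrative labels, while the counts are those reported in Table 1.

```python
N_SUBJECTS = 200

# First-choice "best liked" counts for the first four entries of Table 1.
best_liked = {
    "Johnson Wax": 30,
    "Lucky Strike (Jack Benny)": 26,
    "Pepsodent": 16,
    "United Fruit": 13,
}
no_likes = 10


def per_cent(count, total=N_SUBJECTS):
    """Express a count as a percentage of the subjects."""
    return 100.0 * count / total


top_four = per_cent(sum(best_liked.values()))
print(top_four)            # 42.5
print(per_cent(no_likes))  # 5.0
```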

Referring to Table 2, it may be seen that the radio advertisement
which is most disliked by the greatest number of subjects is the “Only
Fifty Cents Down on a Lay By” skit, a station-break spot sponsored by
the Northwestern Woolen Company.

It is interesting to note that this commercial, which was selected by
more subjects than any other in this investigation and which provoked
more unfavorable comment than all the others combined, failed to impress
the name of its sponsor on almost one-third of the subjects who selected
it. While three of these subjects admitted that they could not recall
the name of the sponsor, the rest offered such variations as the following:
Minnesota Woolen Mills, North Side Woolen Company, North S[tar]
Woolen Mills, North West Woolen Company, Northern State [...],
Northern Woolens, and Northwest Wool Company.




Table 2
Most Disliked Radio Advertisements Submitted by 200 College Students

Program                 Sponsor        Feature Disliked                  Number  Per Cent
Station-break spot      N.W. Woolen    “Only 50¢ Down on a Lay By” skit      39      19.5
Hit Parade              Lucky Strike   “LS/MFT,” auctioneer, and slogans     25      12.5
Jack Smith              Oxydol         “Oooh, That Oxydol Sparkle”           13       6.5
Station-break spot      Balm Barr      “I Use Balm Barr Lotion”              12       6.0
Station-break spot      Whiz           “Whiz, Best Nickel Candy”             10       5.0
Station-break spot      Lifebuoy       “B. O.” foghorn effect                 7       3.5
Red Skelton             Raleigh        “Proof Positive” slogans               7       3.5
Cavalcade of Sports     Gillette       Commercials interrupt sport-cast       6       3.0
Station-break spot      Super Suds     “Super Suds, Super Suds”               5       2.5
Station-break spot      Duz            “D-U-Z Does Everything”                5       2.5
Station-break spot      Virginia Dare  “Say It Again, Virginia Dare”         5       2.5
Station-break spot      Wildroot       “Wildroot Cream Oil Charlie”           5       2.5
Station-break spot      Ex-Lax         “Strike the Happy Medium”              4       2.0
Station-break spot      Pepsi Cola     “Pepsi Cola Hits the Spot”             3       1.5
Station-break spot      Grape Nuts     “Grape Nuts Flakes Are Good”           3       1.5
Station-break spot      Rinso          “Rinso White, Rinso White”             3       1.5
Cedric Adams            Taystee        Commercials interrupt news-cast        3       1.5
Radio Theater           Lux            Endorsements by guest stars            3       1.5
Station-break spot      Prell          “P-R-E-L-L, Prell Shampoo”             2       1.0
Bob Hawk                Camel          Medical claims                         2       1.0
Screen Guild            Lady Esther    Lady Esther's whispering voice         2       1.0
Others                                                                       36      18.0
No dislikes                                                                   0       0.0
Total                                                                       200     100.0


The first five commercials comprise the most disliked radio advertise- 
ments of 49.5 per cent of the subjects. The others dwindle in unpopu- 
larity very rapidly. 

It is rather interesting to observe that, while five per cent of the 
subjects are unable to mention a radio advertisement which they like, 
none of the subjects are unable to mention a radio advertisement which
they dislike. 

In Table 3, the best liked and most disliked radio advertisements are
grouped together according to class or type of commercial. Examples
of each class of advertisement are given under each main class or type.

At first glance, it may seem rather paradoxical that the type of
commercial which is best liked by the greatest number of subjects should
also be most disliked by the greatest number of subjects. This
phenomenon may be understood by comparing the best liked and most disliked
commercials of this type. Such a comparison is made in Table 4.


Table 3
Best Liked and Most Disliked Classes of Radio Advertisement

[Table body not recoverable from the source scan.]




Table 4

Comparison of Best Liked and Most Disliked Commercials in that Class of Radio
Advertisements which was Both Best Liked and Most Disliked

Attention-Getting Commercials                 Best Liked     Most Disliked
Presented as Jingles, Slogans, etc.           No. Per Cent    No. Per Cent
N.W. Woolen “Only 50¢ Down” skit                0      0.0     39     19.5
Lucky Strike “LS/MFT,” auctioneer, slogans      5      2.5     25     12.5
Oxydol “Oooh, That Oxydol Sparkle”              1      0.5     13      6.5
Balm Barr “I Use Balm Barr Lotion”              0      0.0     12      6.0
Whiz “Whiz, Best Nickel Candy”                  0      0.0     10      5.0
Lifebuoy “B. O.” foghorn effect                 0      0.0      7      3.5
Raleigh “Proof Positive” slogan                 0      0.0      7      3.5
Virginia Dare “Say It Again, Virginia Dare”    5      2.5      5      2.5
Wildroot “Wildroot Cream Oil Charlie”           5      2.5      5      2.5
Super Suds “Super Suds, Super Suds”             1      0.5      5      2.5
Duz “D-U-Z Does Everything”                     1      0.5      5      2.5
Ex-Lax “Strike the Happy Medium”                0      0.0      4      2.0
Pepsi Cola “Pepsi Cola Hits the Spot”           1      0.5      3      1.5
Grape Nuts “Grape Nuts Flakes Are Good”         1      0.5      3      1.5
Rinso “Rinso White, Rinso White”                1      0.5      3      1.5
Prell “P-R-E-L-L, Prell Shampoo”                0      0.0      2      1.0
United Fruit “Chiquita Banana”                 13      6.5      1      0.5
Pepsodent “Poor Miriam” song                   16      8.0      0      0.0
Chesterfield “ABC” theme                        7      3.5      0      0.0
Cresta Blanca “C-R-E-S-T-A B-L-A-N-C-A”         6      3.0      0      0.0
Camel “C-a-m-e-l-s” and “Lemac” songs           3      1.5      0      0.0
Canada Dry “Canada Dry Water” theme             2      1.0      0      0.0
Campbell “Mmm-mmm, Good” theme                  2      1.0      0      0.0
Others                                          5      2.5     21     10.5
Total                                          75     37.5    170     85.0


An inspection of this table reveals that an attention-getting
commercial which is selected by a certain number of subjects as being their
best liked radio advertisement will not be selected by a comparable
number of subjects as being their most disliked radio advertisement, and
vice versa. For example, the Northwestern Woolen Company “Only
Fifty Cents Down on a Lay By” skit was selected by 19.5 per cent of the
subjects as their most disliked radio advertisement; not a single subject
selected this commercial as his best liked radio advertisement. The
commercials used on the Lucky Strike Hit Parade were selected by 12.5
per cent of the subjects as their most disliked radio advertisements;
only 2.5 per cent, one-fifth as many, selected these commercials as
their best liked radio advertisements.

Thus, a commercial which is best liked by many is most disliked by
few, and vice versa. In other words, the subjects are specific in their
likes and dislikes: while some may be more favorably or less favorably 
disposed toward the general class of attention-getting commercials than 
others, most have some specific attention-getting commercials which 
they like and others which they dislike. 

In Table 5 а comparison is made of the reasons for liking and disliking 
radio advertisements as determined by this study and by a study made 
by the National Opinion Research Center. 


Table 5

Percentage of Reasons for Liking and Disliking Radio Advertisements as Determined
by this Study (N = 200) and by the N. O. R. C. Study (N = 2246)

Reasons for Liking and Disliking                  This Study   N. O. R. C. Study
Commercials†                                 Liked  Disliked     Liked  Disliked
Identifying slogans or sound effects            3%       42%        1%        3%
Singing or rhyming commercials                  24        28         5      [11]
Variety vs. monotony                             4        22       [?]         7
Clever humor vs. silly humor                    31        17         8         4
Voice or manner of speaker or singer‡            2        16         3         2
Fits program vs. interrupts program             34        13        10         2
Dignity vs. poor taste                           8        10         2         4
Instructive vs. useless                          6         9         4         *
Brevity vs. lengthiness                         12         6        10         6
Unbiased vs. biased                              2         6         2         4
Miscellaneous                                    1         3         5         4

† More than one reason per subject was possible.
‡ In N. O. R. C. study this category included only voice or manner of announcer.
* Less than half of one per cent.

The differences in relative and absolute magnitude of the percentages 
obtained in these two studies may be attributed to two factors. 

The first factor, which possibly affected the differences in relative 
magnitude of the percentages to a greater extent than the absolute, con- 
cerns the composition of the groups of subjects involved. In this study 
the subjects were selected from an extremely narrow segment of the 
population: this group is markedly homogeneous as to age, intelligence, 
education, socio-economic status, and so forth. On the other hand, in 
the N. O. R. C. study the respondents were selected in such a way as to
represent an accurate cross-section of the general population. 

The second factor, which probably affected the differences in absolute
magnitude of the percentages to a greater extent than the relative,
concerns the collection of the data. In this study each subject was directed
to state which radio advertisement he liked the best and which he
disliked the most and to give the reasons for his choices. In the N. O. R. C.
study each respondent was asked the following questions: 1. “Can you
give me an example of what you think is the best advertising you've heard
on the radio?” and 2. “Can you give me an example of what you think
is the worst advertising you've heard on the radio?” If “Yes”: “What
didn't you like about it?” Of the 2246 respondents, only 43 per cent
could give examples of what they liked and only 39 per cent could give
examples of what they disliked. Therefore, the percentage of the total
number of respondents giving reasons for their likes and dislikes is
necessarily much smaller in the N. O. R. C. study than in this study.

1 Lazarsfeld, P. F., and Field, H. The people look at radio. Chapel Hill: The
University of North Carolina Press, 1946.
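
The effect of the differing bases can be illustrated by re-expressing an N. O. R. C. percentage on the base of those respondents who could give an example. The sketch below rests on an assumption, namely that the N. O. R. C. figures in Table 5 are percentages of all 2246 respondents, as the preceding paragraph suggests; the function name is hypothetical, and no such rebased figures appear in the source.

```python
def rebase(pct_of_all, pct_giving_examples):
    """Convert a percentage of all respondents into a percentage of
    those respondents who were able to give an example."""
    return 100.0 * pct_of_all / pct_giving_examples


# "Fits program" was a reason for liking for 10 per cent of N. O. R. C.
# respondents, and 43 per cent could give a liked example.
print(round(rebase(10, 43), 1))  # 23.3

# "Identifying slogans or sound effects" was a reason for disliking for
# 3 per cent, and 39 per cent could give a disliked example.
print(round(rebase(3, 39), 1))   # 7.7
```

Even on this adjusted base the figures remain only roughly comparable with those of the present study, since the two groups of subjects differ in composition and more than one reason per subject was possible.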


Summary and Conclusions 


1. The best liked classes of radio advertisements are the attention-
getting commercial presented as a jingle, slogan, skit, and so forth and 
the clever or humorous commercial incorporated into the dialogue by 
the cast. 

2. The most disliked class of radio advertisements is the attention- 
getting commercial presented as a jingle, slogan, skit, and so forth. 

3. An attention-getting commercial which is selected by а certain 
number of subjects as their best liked radio advertisement will not be 
selected by а comparable number of subjects as their most disliked radio 
advertisement, and vice versa. While many attention-getting
commercials are generally disliked, certain others are generally liked.

4. The principal reasons given for liking radio advertisements are the 
following: (1) fits program, (2) clever humor, (3) singing or rhyming 
commercials, and (4) brevity. 

5. The principal reasons given for disliking radio advertisements are 
the following: (1) identifying slogans or sound effects, (2) singing or
rhyming commercials, (3) monotony, (4) silly humor, and (5) voice or
manner of speaker or singer. 

6. In view of the above it seems that great care should be exercised in 
composing radio advertising and that it should then be carefully pre-
tested. It would be difficult indeed to justify some of the more highly 
disliked radio advertisements. 

Received April 1, 1947. 


Opinion Polling with Mark-Sensed Punch Cards 


N. L. Gage and H. H. Remmers 
Purdue University 


Ever since psychologists began studying individual differences, they
have been plagued with the difficulties of handling large masses of data.
The greater the number of cases studied, and the more intensive the
kinds of statistical analyses attempted, the greater have been the burdens
of clerical labor to be carried in the search for valid generalizations.

One early step in the direction of easing this burden was the
development of objectively scorable, or short-answer, tests whose scoring
could be turned over to clerks. But with the development of large scale,
periodic testing programs, it soon became evident that further aid was
desirable to reduce the work in scoring thousands of test booklets. To
meet this need, many kinds of scoring devices were developed of which
the most successful and widely used has probably been the International
Business Machines Test Scoring Machine (IBM). With these devices,
the rapid and accurate scoring of large numbers of tests became much
more economical and efficient. By-products of these devices were
more extensive utilization of item analysis techniques and also, of more
dubious value, a greater emphasis on the multiple-choice type of test
item.

For one kind of appraisal of psychological data, however, the test [...]
of tabulating these frequencies in relationship both to other single
questions and to various types of personal data. This is the kind of analysis
that is most frequently made of the data secured by opinion polling and [...]

The trends in opinion polling toward depth interviewing, the clinical
approach, open-end questioning, and other more intensive and subjective
techniques, may become accentuated. But it is probable that for some
time to come opinion polling will [...]
the responses to which are restri[cted] [...]
Furthermore, [...]


higher order breakdowns that are necessary for unequivocal investigations 
of the relative importance of determinants of opinion, they will need 
samples consisting of much larger numbers of cases than they have
hitherto required. Thirdly, in some kinds of polling situations, such as 
that in which the Purdue Opinion Poll for Young People finds itself, it is 
necessary to secure samples in the form of intact groups, say whole class- 
rooms or schools, which require larger numbers of cases for a given level 
of reliability than do samples that are not composed of intact groups. 
All these considerations point to the desirability of further aids in the 
analysis of data. The purpose of this paper is to describe the use for 
such purposes of the mark-sensing attachment to the IBM Reproducing 
Punch. 

What the mark-sensing attachment does is to convert pencil marks 
on an IBM card to punched holes in the card at a speed of 100 cards
per minute. During the past year, the Purdue Opinion Poll for Young 
People has conducted three polls by means of the mark-sensing technique. 
The following details of the procedure are pertinent here. 


1. When used for mark-sensing, the IBM card is divided into twenty- 
seven columns, each covering three of the columns in which holes can be 
punched. The pencil mark must be made with an electrographic pencil 
so as to cover at least two of the punching columns. An electrical impulse
conducted by the pencil mark is then passed through an electronic am-
plifying unit to activate the mechanism that punches the hole in the card.
Ten mark-sensing brushes are available so that ten mark-sensing columns
can be punched with one passage of the card through the machine. By
means of a flexible wiring arrangement the hole can be punched from any
mark-sensing column to any other column on the card including the 
column in which the mark is placed. Usually, the machine is wired so 
as to punch the hole in the same column in which the mark has been made. 

2. In order to make the same card form usable in many different
polls or rating schemes, the card was designed so that each of the 324
marking spaces was uniquely numbered. The various response alter-
natives furnished with the questions of opinion and personal data were
assigned numbers to correspond with certain spaces on the card. The
pupil is instructed to read the question and choose the answer that applies
to himself, notice the number of that answer, find the space with that
number on the card and fill in that space with a heavy mark of the special
pencil.

3. The mark-sensing device will punch any or all of the twelve spaces
in a given column with one passage of the card through the machine. It
is consequently possible to exploit the possibilities of multiple-punching
to a much greater extent than is feasible with manual key-punching.


90 N. L. Gage and H. H. Remmers


That is, although in manual punching it is inconvenient to punch more 
than a single item in any given column, it is as easy with mark-sense 
punching to secure twelve punches in a column as it is to secure a single 
punch. Since each of the questions on the Poll had three response alter- 
natives, it was possible to design the coding so that four three-choice 
questions were coded for a single column. Such multiple punching had 
distinct advantages when it came to counting the responses to the ques- 
tions. By means of the IBM counting sorter, the responses to four 
three-choice questions could be counted simultaneously with one passage 
of the cards through the sorter. The number of pupils choosing each of 
the three answers to the four questions was then recorded by hand from 
the Veeder counters on the sorter. Since the sorter handles 400 cards per 
minute and yields twelve frequencies per run, the use of multiple-punching 
was four times more efficient than if a single-punching code had been used. 

4. Multiple-punching also had advantages in the process of securing
scatter diagrams for inter-item correlation. The sorter was used to di-
vide the pupils into three groups according to the three responses to a
given question. These three groups were then sent through the sorting
machine separately and the numbers in each group who had chosen each
of the three responses to the four questions in a given column were counted
simultaneously. The multiple-punching in effect reduced the number
of runs required for securing intercorrelations to one-fourth the number
otherwise necessary.

5. In performing a scale analysis by the Cornell technique as described 
by Guttman,¹ the IBM tabulating machine equipped with digit selectors
also proved much more efficient by virtue of the multiple-punching code. 
Preliminary weights were assigned to each of the response alternatives. 
Then the cards were assigned total scores by using each card as a window 
stencil placed over another IBM card marked as a scoring key. The 
total scores were then punched in the card, and the cards were sorted to 
place them in descending order of total scores. When the cards were 
sent through the tabulator, the responses to four items at a time were 
listed. Inspection for unidimensionality and reproducibility could then 
be carried out in accordance with the Cornell technique.²
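
The coding and counting steps in items 2 through 4 can be sketched in a few lines of modern code. This is only an illustration under stated assumptions, not the procedure as wired on the IBM equipment: the function names, the way the 324 spaces are numbered, the placement of each question's three alternatives in adjacent rows, and the three sample pupils are all invented for the example. Only the figures 27, 12, 3, and 324 come from the text.

```python
from collections import Counter

MARK_COLUMNS = 27        # mark-sensing columns per card (from the text)
ROWS = 12                # punching positions per column (from the text)
CHOICES = 3              # response alternatives per question (from the text)
SLOTS = ROWS // CHOICES  # questions that can be coded in one column: 4


def space_number(column, row):
    """Assumed numbering of the spaces: 1-12 down column 0, 13-24 next, etc."""
    return column * ROWS + row + 1


def row_for(slot, choice):
    """Row (0-11) holding a given choice (0-2) of the question in a slot (0-3)."""
    return slot * CHOICES + choice


def punch_column(answers):
    """Multiple-punch one column: one hole for each of four three-choice answers."""
    if len(answers) != SLOTS or any(a not in range(CHOICES) for a in answers):
        raise ValueError("expected four answers, each 0, 1, or 2")
    return frozenset(row_for(slot, a) for slot, a in enumerate(answers))


def sorter_pass(cards):
    """Imitate one run through the counting sorter: twelve frequencies at once."""
    counts = Counter()
    for punched in cards:
        counts.update(punched)
    return [counts[row] for row in range(ROWS)]


def cross_tabulate(cards, slot):
    """Split the cards on one question, then count each group in a separate run."""
    return [
        sorter_pass([c for c in cards if row_for(slot, choice) in c])
        for choice in range(CHOICES)
    ]


# Three invented pupils, each answering the four questions coded in one column.
pupils = [(0, 1, 2, 0), (0, 1, 1, 2), (2, 1, 0, 0)]
cards = [punch_column(p) for p in pupils]
frequencies = sorter_pass(cards)      # twelve counts from a single pass
table = cross_tabulate(cards, slot=0)  # three passes give a 3 x 12 table
print(frequencies)
print(table)
```

Each simulated pass returns twelve frequencies where a single-punch code would return three, which is the fourfold saving described in item 3. The cross-tabulation likewise needs only three passes to relate one question to the four questions in a column.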

Because of the novelty of the method, it was important to ascertain the 
amount of error resulting from pupils’ failures to comprehend and follow 
the directions for recording responses on the cards. In the situations 
under which these polls were administered, by teachers located in widely 
scattered schools and instructed only by means of a sheet of directions, 

1 Guttman, L. The Cornell technique for scale and intensity analysis. Educational
and Psychological Measurement, 1947, 7, 247-279.

2 Gage, N. L. Scaling and factorial design in opinion poll analysis. Lafayette, Ind.:
Purdue University, Studies in Attitudes, X, 1947, 84.




it was found that between one and two per cent of the marks had been 
made improperly. It is anticipated that with improved directions to 
pupils and with increasing familiarity of pupils with this method, this 
error can be further reduced.

Further applications of the mark-sensing technique can readily be 
made. In general, wherever interest is centered not on a total score 
based on a number of questions but rather on the frequency of various 
kinds of responses, the mark-sensing technique may have great advan- 
tages in the analysis of large numbers of cases. Among the other appli- 
cations that have already been considered are the summarization of 
ratings of a single person by a large number of other persons, as when 
classroom teachers are rated by their pupils. Polls carried out by inter- 
viewers can also be analyzed in this way if the interviewers record the 
responses of interviewees on mark-sensing cards rather than on tally 
sheets. The scaling of items by the Thurstone equal-appearing inter- 
vals technique can be done by having the judges mark their ratings of 
items on cards rather than by having them place the items in various
piles. 


Summary 


In summary, then, we have described the application of mark-sensed 
punch card technique to the recording and analysis of large masses of
psychological data. Application of the technique to questionnaire studies 
as in the Purdue Opinion Poll for Young People has proved successful in 
terms of speed and accuracy of tabulation, and in terms of increased 
efficiency in securing item intercorrelations and performing scale analyses. 
In general, the method will bear investigation of its usefulness in a wide 
variety of psychological investigations.
Received May 8, 1947. 


News Notes 


The name “Committee on Aviation Psychology” has been substituted
for “Committee on Selection and Training of Aircraft Pilots” to designate
a committee of the Division of Anthropology and Psychology, National 
Research Council, which has conducted research in the field of aviation 
psychology since 1939. As in the past, the work of the Committee is 
supported with funds allotted by the Civil Aeronautics Administration, 
although steps have recently been initiated to undertake research for 
other Government agencies. 

The Committee has from its very beginning conducted research in- 
volving the maintenance as well as the selection and training of aircraft 
pilots. Among the more than 70 reports published by the Committee 
in the Technical Series, Division of Research, Civil Aeronautics Admin- 
istration, there are many which are concerned with the psychological 
aspects of fatigue, accidents, air sickness, etc. 

The current research program of the Committee on Aviation Psycho- 
logy is largely concerned with air transport pilots and, most particularly, 
with human factors in airplane accidents. The program includes research 
on the selection, upgrading, and certification of pilots; studies of stall 
warning devices; psychological aspects of instrumentation; as well as 
investigation on methods of training civilian and commercial pilots. 
Such research is being conducted through The Ohio State University, 
the American Institute for Research, the Educational Research Corpora-
tion, and other agencies. The Executive Subcommittee will be pleased 
to consider proposals for grants-in-aid to research personnel working in 
universities and other institutions who are interested in carrying out in- 
vestigations in the field of aviation psychology under the auspices of the 
Committee. Proposals should be submitted to Dr. M. S. Viteles, Chair- 
man, National Research Council Committee on Aviation Psychology; 
University of Pennsylvania, Philadelphia, Penna. 


The present membership of the Committee on Aviation Psychology
includes: 


Brigadier General Milton W. Arnold, Vice President—Operations and
Engineering, Air Transport Assn. of America; *Comdr. Norman L. Barr,
Division of Aviation Medicine, Bureau of Medicine and Surgery, U. S. Navy;
*Dr. George K. Bennett (ex officio), President, Psychological Corporation;
*Dr. Dean R. Brimhall, Assistant to the Administrator for Research, Civil
Aeronautics Administration; *Dr. Paul M. Fitts, Chief, Psychology Branch,
Aero Medical Laboratory, Wright Field; *Dr. Frank A. Geldard, Professor of




Psychology, University of Virginia; Captain B. Groesbeck, Division of Avia- 
tion Medicine, Bureau of Medicine and Surgery, Navy Department; Major 
General Malcolm C. Grow, Air Surgeon, Army Air Forces; George E. Hadda-
way, Editor, Southern Flight, Air Review Publishing Corporation; *Dr. A. I.
Hallowell (ex officio), Chairman, Division of Anthropology and Psychology;
Professor of Anthropology, University of Pennsylvania; Dr. J. G. Jenkins, 
Professor of Psychology; Head, Department of Psychology, University of 
Maryland; *Captain Wilbur E. Kellum, U. S. Navy, School of Aviation Medi- 
cine, Pensacola, Florida; Dr. Peter C. Kronfeld, Associate Professor of Ophthal- 
mology and Director of Education, Illinois Eye and Ear Infirmary, University 
of Illinois; Jerome Lederer, Chief Engineer, Aero Insurance Underwriters; 
*Dr. Donald B. Lindsley, Professor of Psychology, Northwestern University; 
Dr. W. R. Miles, Professor of Psychology, School of Medicine, Yale University;
Dr. C. L. Shartle, Professor of Psychology, The Ohio State University; *Lt. 
Col. Anthony C. Tucker, Office of the Air Surgeon, Headquarters, Army Air 
Forces; and *Dr. M. S. Viteles, Chairman, Committee on Aviation Psychology; 
Professor of Psychology, University of Pennsylvania. 


Personnel Psychology, a Journal of Practical Research, has announced 
its first issue for January, 1948. It will publish papers related to re- 
search in the personnel field, particularly in selection, placement and 
evaluation, training, job analysis, classification and compensation, em- 
ployee relations, motivation and morale, industrial design, and work 
conditions. Readability is to be combined with scientific excellence as 
a criterion.

Personnel Psychology will be edited by G. Frederick Kuder, Duke 
University, with M. W. Richardson, Richardson, Bellows, Henry and
Co.; George K. Bennett, Psychological Corporation; and W. V. D.
Bingham as advisors on editorial policy. 

Manuscripts and editorial communications should be sent to Dr. 
Frederick Kuder. If sent before February 1, 1948, they should be ad-
dressed to 306 Edmonds Building, 915 Fifteenth Street, Washington 5,
D. C. After February 1, 1948, Dr. Kuder's address will be Department
of Psychology, Duke University, Durham, North Carolina. 

Personnel Psychology will be issued quarterly, beginning January, 
1948. The yearly volume will consist of 400-500 pages. The sub-
scription rate is $6.00 per volume; foreign, $7.00 per volume; single
copies, $2.00. Subscriptions and business communications should be sent
to Dr. Erwin K. Taylor, 1727 Harvard Street, N. W., Washington 9, D. C. 


* Member of Executive Subcommittee.


Book Reviews 


Maier, Norman R. F. Psychology in industry: a psychological approach
to industrial problems. Boston: Houghton Mifflin, 1946. Pp. xvi + 
463. $3.00. 


This kind of book has been long overdue. It is a systematic but non- 
technical presentation of the applications of psychological principles 
and techniques to the functioning of industrial organizations. As such, 
it is “primarily intended for the many who are concerned with human 
problems in industry and who themselves are not industrial psycho- 
logists.” It should find wide use in college courses aimed at acquainting 
industry’s potential executives and technicians with psychology’s signifi- 
cance for their future vocational activities and responsibilities. 

Dr. Maier places strong emphasis throughout his book on the dyna- 
mics of human behavior in industry, specifically on matters of motivation, 
attitudes, and morale. Five of the nineteen chapters are devoted ex- 
clusively to problems of attitudes, frustration, morale, and motivation. 
The systematic viewpoint taken is that “frustration and motivation 
represent two distinct sources of action in the individual” and that “to
understand the behavior at any particular time we must know which
mechanism is dominating.” The chapter on “Frustration as a Factor in
Forming Attitudes and Developing Social Movements” is of especial
significance. However, that significance might be even clearer if the
detailed discussion of motivation preceded the discussion of frustration
instead of following several chapters later.

Two introductory chapters discuss the scientific study of behavior 
and the concept of causation in behavior. Other chapters cover major 
topics in the field of industrial psychology in a thorough but non-technical 
way. These topics include the measurement of proficiency, the nature
and use of tests, motion-and-time study, training and learning, fatigue, 
accidents, working conditions, and labor turnover. The basic frame of 
reference—attitudes, morale, and motivation—is considered in relation 
to each of these topics. Thus each technique and principle is seen against 
the background of a living, functioning industrial organization. The 
concluding chapter reviews and points up the significance of each major 
topic for, first, the first-line supervisor; then, the employee counselor; 
and finally, higher levels of management. Following the last chapter is
a chapter by chapter bibliography. 



The book contains a considerable amount of material that is relatively 
new to the field of industrial psychology. This material includes discus- 
sions of the level of aspiration, of resistance to interruption of a task, 
and of the application of sociometry to industrial situations, and prelimi- 
nary reports of experiments conducted by Alex Bavelas on democratic 
techniques of supervision in industry. 

There are a few serious omissions. The concept and problem of 
reliability are not considered at all in connection with either merit rating 
or tests. This concept is no more technical than others included in the 
book, and certainly it is of major importance. Although there is an ex- 
cellent treatment of training and learning in industry, there is no mention 
of the Training Within Industry program. In spite of the great emphasis 
on morale, the morale of first-line supervisors, which is currently a very 
live issue, is barely mentioned. The suggestion in the final chapter that 
higher management doesn’t need as high a level of supervisory skills as 
first-line supervisors would be difficult to defend. 

This reviewer’s knowledge of and experience with first-line supervisors 
in several kinds of business and industrial organizations convinces him 
that the book is much better adapted to college students and college- 
trained technicians and supervisors than to the average first-line super- 
visor in industry. It is pitched at too high a level of comprehension for 
a large proportion of industry’s present first-line supervisors. And it
is written too much in the style of the typical college text to have strong 
appeal for such supervisors. 

Several experiments with animals are cited as illustrations of various 
mechanisms. To expect the average personnel manager, industrial engi-
neer, or supervisor to believe that such experiments have significance
for his work seems to be a bit naive. Practically no efforts are made in 
the book to suggest why animal experiments are relevant. 

For an introductory college course on psychology’s applications for 
business and industry, Psychology in industry is an excellent text. A non-
technical book on industrial psychology that will capture the interest of
industry’s first-line supervisors without sacrificing accuracy is still to 
be written. 

Kenneth A. Millard 

Macalester College, 

St. Paul, Minnesota 


Gray, J. Stanley, et al. Psychology in human affairs. New York:
McGraw-Hill, 1946. Pp. viii + 646. $3.75. 
The reader may recall a similar book edited by the same author in 
1941 with contributions by a dozen psychologists. The present work 
is not a revision and, in fact, the earlier one is not mentioned. However,




they do cover much the same ground. Gray does several chapters him-
self and with one exception the collaborators are different from those
who participated in the previous enterprise. Most of the present col-
laborators are less well known but, on the whole, the work is better in-
tegrated. Perhaps the editor was in a better position to direct and
modify their contributions and perhaps there was some advantage in the
fact that many of them were geographically accessible. At any rate,
there is less apparent discontinuity than in the average book by a con-
glomeration of authors. There are still a few important omissions
(infra) presumably where one contributor expected another to cover the 
point, and there are apparent differences in style, but neither of these 
shortcomings is too serious. After reading the first six chapters the 
reviewer wished that Gray had kept right on and done all of them. 

The rationale back of the arrangement of the chapters is not entirely 
clear. While business and industrial problems are discussed in sequence, 
the chapters on adjustment, mental illness, and clinical practice are not 
contiguous. However, most of the material is there anyway, which is 
the important thing. The major fields of education, business, and
medicine in their psychological implications are treated adequately, with
the field of criminal and legal psychology slighted somewhat. There are
also chapters on public opinion, music, art and literature, and military
psychology.

The book is quite definitely a textbook, presumably aimed at about
the second course after the student has a general background in psycho- 
logy. The treatment is commendably factual, citing experiments, and 
with extensive annotation. This would make it a bit heavy for the 
layman but is altogether appropriate in a text. 

Any reviewer is more favorably impressed by some parts of a book 
than by others. In the present case, vocational guidance seems to be 
Gray’s best chapter and the one on public opinion is among the best from 
the contributing authors. The discussion of psychology in crime appears 
to be the least satisfactory and contains rather little psychology. 

There are a few serious omissions. There is nothing on scientific 
crime detection, a field in which the public is rather misinformed and in 
which students might profitably be set straight. There is nothing about 
the psychology of testimony. Another major omission is morale in in- 
dustry and the role of psychological factors in supervision. Morale is 
mentioned in the military discussion but not in the industrial. In em- 
ployment psychology, there is all too brief a presentation of employment 
tests, with a disproportionately extensive treatment of interviews and
personal data. Tests constitute psychology's greatest contribution in
this field. There are some other places where there is not an actual 
omission but where the discussion might profitably be extended a little. 




Points which impressed the reviewer in this respect are how to study, 
personality tests, psychotherapy, and clinical techniques. 

On the other hand, there are some points where the treatment is more 
extensive than is usually the case in texts on applied psychology. Speech 
correction gets a whole chapter, and the psychological effects of drugs have
a fairly detailed treatment. Neither of these detracts at all from the
value of the work, but if it were a question of condensing something in 
order to include some of the omissions mentioned above, these topics 
might be considered. 

The question of statistics is always a moot one in a book on applied 
psychology. It is not quite clear whether it is assumed that the student 
will have had some of the fundamentals before tackling this particular 
book. It practically assumes that he knows something about correlation 
coefficients because there is only a brief footnote regarding them when 
they first appear in the discussion. It is well for the student to get the 
notion that psychology is a quantitative science. The text will give 
him that much, but would be better if, in addition, he learned a little 
more about interpreting quantitative data in terms of variability, correla- 
tions, and probability concepts. 

The book covers the ground about as well as any text could be ex- 
pected to do in view of the terrific size of the field of applied psychology 
at the present time. It presents a very considerable amount of source 
material, citing facts in support of a great many of the points brought out 
and implementing the discussion with numerous tables and graphs. The 
annotations and footnotes would be helpful to anyone interested in run- 
ning down detailed material. While the work is not actually a source 
book, it borders on it in spots. From the teaching standpoint, the out- 
lines at the beginning of each chapter will be helpful, and the suggested 
readings at the ends of the chapters are well chosen. If a student gets
through this book under a teacher who can find his way around in the
field, he will certainly know something about applied psychology.
Harold E. Burtt 
Ohio State University 


Robinson, Francis P. Effective study. New York: Harper and Brothers,
1946. Pp. ix + 262. $3.00.

This extensive revision of the author’s previous (1941) Diagnostic 
and remedial techniques for effective study is a manual or workbook written 
to be used by students enrolled in how-to-study courses. It could be used
with equal profit in many so-called college orientation courses. The
book represents a coordination of the diagnostic and training devices 
and principles which have been demonstrated to improve scholarship at 




Ohio State University. It is designed to “free the instructor so that he
may become a counselor rather than a task master.”

The first part of the book describes “Higher Level Work Skills”;
these are traditional how-to-study topics and principles treated in non- 
traditional fashion. The motivated student to whom the logic of its 
presentation appeals should benefit greatly from using the “tricks of the 
trade” which receive detailed treatment in this section. Remedial read- 
ing and effective language usage are described, and various exercises 
provided, in the second part. In the last part problem areas familiar
to the student personnel worker receive considerably more attention than 
is customary in books designed to teach study skills. 

Throughout the book are included charts, graphs, rating scales, 
check lists of various sorts, and other devices by which original status 
and progress throughout a course may be known to the student. The 
motivational value of knowledge of progress is wisely exploited by the 
author. 

Objective tests cover fields ranging from knowledge of social usage 
or etiquette, through dictionary and library usage, to principles of mathe- 
matical formulation. Norms for (presumably Ohio State) freshmen and 
keys are included in the book. The diagnostic tests provide the student 
in a how-to-study course with opportunities to evaluate his skills and 
adjustments to college. It is on the basis of data made available through 
the use of the book that the student, under the guidance of the instructor, 
develops study skills and adapts them to courses currently being studied. 

The author's appeals are frankly made to serious students of all levels 
of ability who appreciate the logic of doing a job right. This is not a book
to be given as a “going away to college” present to a freshman. Its
effective use demands guidance by a competent instructor, who, in ad- 
dition, realizes it does not provide short cut methods for so-called group 
counseling. If used in a college having various agencies for remedial 
work and counseling to which the instructor could refer some of his stu- 
dents, the book will prove all the more valuable. 

Use of the book for the purposes for which it was written commits 
the instructor selecting it as a work manual to a plan of action which 
may, indeed, reorient the instructor and school as well as the students. 


C. d'A. Gerken
The State University of Iowa


Blankenship, Albert B. (ed.). How to conduct consumer and opinion re-
search. The sampling survey in operation. New York: Harper &
Brothers, 1946. Pp. xi + 314. $4.00.


Dr. Blankenship has induced twenty-five busy research practitioners 
(nine of them psychologists) to describe important examples of their work. 




The resulting chapters constitute the best available collection of material 
on applications of questionnaire and interview research to business uses. 
The contributions focus predominantly on market research; only a small 
part of the volume deals with public opinion polling and with attitude 
research for government. 

Emphasis is placed on the practical importance of the research de- 
scribed. The aim, the editor states, is to help businessmen determine 
what services they should utilize in this field and to acquaint students 
with research applications of the “sampling survey” which lie beyond 
“elementary knowledge of techniques.”

The twenty-two chapters apart from the introduction may be grouped 
as follows: eight are on market research, including studies of product 
improvement, brand acceptance, dealer performance, and advertising; 
five are on radio research, principally audience measurement for com- 
mercial purposes; two deal with magazine reading; two with employee 
attitudes and public relations; four are on governmental use of sampling 
surveys and one is on public opinion polls.

A few of the more general chapters cover the purposes of marketing 
studies and the steps in questionnaire research, methods used in testing 
advertisements and in developing advertising copy, public opinion polling 
activities and the newer procedures in sampling. Other chapters sketch 
more specific examples of applied opinion work. The reader learns how
radio audience size is measured in the famous Hooperatings; he is told
the methods for counting magazine readers in Life’s elaborate “Continu-
ing Study of Magazine Audiences”; he sees the Psychological Corpora-
tion’s “brand barometer” and the Industrial Surveys Company’s use of
extensive consumer panels to determine people’s changing brand buying
behavior and its relation to their magazine reading and radio listening;
he looks at practical illustrations of studies to improve products and ad-
vertisements; he is introduced to the techniques of employee attitude
measurement, research on magazine editorial problems, and detailed
analysis of radio programs. Government survey work is represented by
brief reports of studies by the Office of War Information, the Office of
Civilian Requirements, the Bureau of the Census, and the Division of
Program Surveys in the Department of Agriculture (a total of 44 pages
for all of these).

The treatment throughout is severely practical and empirical. There
is almost no reference to guiding theories or use of hypotheses—no sug-
gestion that sound theory may also contribute valuably to applied re-
search. There is likewise little critical discussion of techniques or com-
parative evaluation of methods. Problems of analysis and interpreta-
tion of findings are especially slighted. And research purposes are dealt




with somewhat idealistically, as if the selection of problems and pro- 
cedures were always aimed at true problem solving and never at proving 
a case or securing sales ammunition. In short, the book is less a rounded
and balanced treatise on research methods than it is an account of suc- 
cessful research services. 

As such, it will be extremely useful in providing students of marketing 
and opinion research with an up-to-date view (if not quite complete and 
realistic) of the practical work being done. There has been remarkably 
little writing of this type available. Consequently the book fills a 
genuine need. Those who are working and teaching in this field owe a 
definite vote of thanks to the editor and contributors. 


Arthur W. Kornhauser 
Wayne University 


Maslow, Paul. The analysis and control of human experiences. The in-
dividual seen through the Rorschach. Vol. I and II. Brooklyn,
New York: Paul Maslow, 1946 and 1947. Pp. 195 and 229. $3.50
each.


The study of “Rorschach Psychology” differs from other scientific 
study, according to the author, in that “. . . the study of the personality 
structure is different because it cannot be seen, felt, smelled, tasted or 
touched and actually does not exist. ‘It’ is invisible, nonmaterial and 
without extension and weight. . . . The Rorschach operates by materi- 
alizing the non-existent and measuring this ‘substance’ to transform the 
complete personality structure into an absolute and, in some instances, 
grim reality” (Vol. I, p. 2). 

Accepting the Rorschach in the unqualified fashion illustrated above,
the first volume deals mainly with a series of short essays (50 chapters) 
in which a great variety of personality types and terms are analyzed in 
the language of the Rorschach. The author places a great deal of re- 
liance upon the specific meaning of single determinants. 

The second volume (58 chapters) consists largely of a series of essays 
on science, epistemology, religion, politics, ethics, therapy and a variety 
of miscellaneous topics. These are illustrated by means of Rorschach 
terms or clarified by knowledge of personality types which have been 
analyzed through the Rorschach. 

The following quotation describing part of the author’s technique of 
Rorschach analysis illustrates the style of the book: “. . . Note the
rhythm of the record, the arrangement of details, the attraction-repulsion
patterns, the quality of the stimulation and the varying pitches of intensity.
Watch for holes, dead-ends, contrasts, . . . brilliant bursts, persistent
triviality, massive complexity, depth, width, laxness, unity. . . .” (Vol.
II, p. 14). No referents for these descriptive phrases are provided.




The two volumes have no systematic organization. No data are presented
and it is not clear to what extent the series of speculations are based on
the clinical experience or reading of the author. To those who accept
the Rorschach as uncritically as the author does these volumes will
contain some interesting speculations; to others they will not be
worth reading. The two volumes are mimeographed and paper bound.


Julian B. Rotter 


New Books, Monographs, and Pamphlets 


Books, monographs, and pamphlets for listing and possible review should be sent to 
Donald G. Paterson, Editor, Department of Psychology, University 
of Minnesota, Minneapolis 14, Minnesota 

Measurement of consumer interest. C. West Churchman, Russell L.
Ackoff, and Murray Wax, Editors. Philadelphia: University of
Pennsylvania Press, 1947. Pp. 214. $3.50.

Dvorine color discrimination screening test. Israel Dvorine. Baltimore:
Israel Dvorine, 1947. Pp. 10. $2.50.

The incidence of neurosis among factory workers. Russell Fraser, et al. 
Report No. 90. Industrial Health Research Board, Medical Re- 
search Council. London, England: H. M. Stationery Office, 1947. 
Pp. 66. 1s. 3d. 

Study your way through school. C. d'A. Gerken. Chicago: Science Research
Associates, 1947. Pp. 48.

Psychotherapy in child guidance. Gordon Hamilton. New York: Col- 
umbia University Press, 1947. Pp. 340. $4.00. 

Sociometry of leadership. Helen J. Jennings. New York: Beacon House,
1947. Pp. 28. $1.50.

Rehabilitation of the physically handicapped. Henry H. Kessler. New 
York: Columbia University Press, 1947. Pp. 274. $3.50. 

The relationship between content of an adult intelligence test and intelligence
test score as a function of age. Rose E. Kushner. New York: Bureau
of Publications, Teachers College, Columbia University, 1947. Pp. 59.

.10. 

Mits, wits, and logic. Lillian R. Lieber. New York: W. W. Norton and
Co., Inc., 1947. Pp. 240. $3.00.

The story of hypnotism. Robert W. Marks. New York: Prentice-Hall, 
Inc., 1947. Pp. 246. $3.00. 

Psychological testing. James L. Mursell. New York: Longmans, Green
and Co., Inc., 1947. Pp. 449. $4.00.

Readings in social psychology. Theodore M. Newcomb, et al. New
York: Henry Holt and Co., 1947. Pp. 672. $3.85.

Abnormal psychology. James D. Page. New York: McGraw-Hill Book
Co., Inc., 1947. Pp. 441. $4.00.

Personnel administration. Paul Pigors and Charles A. Myers. New 
York: McGraw-Hill Book Co., Inc., 1947. Pp. 553. $4.50. 

Foundations for American education. Harold Rugg. Yonkers-on-Hudson,
New York: World Book Company, 1947. Pp. 826.

The selected writings of Benjamin Rush. Dagobert D. Runes, Editor.
New York: The Philosophical Library, 1947. Pp. 433. $5.00.

Introduction to methods in experimental psychology. Second Edition.
Miles A. Tinker. New York: D. Appleton Century, Inc., 1947.
Pp. 232. $4.00.

The thematic apperception test. Silvan S. Tomkins. New York: Grune
and Stratton, 1947. Pp. 297. $5.00.

The reduction of intergroup tensions: a survey of research on problems of
ethnic, racial, and religious group relations. Robin M. Williams.
New York: Social Science Research Council, 1947. Pp. 153. $1.75.

So you want to help people. Rudolph M. Wittenberg. New York: Association
Press, 1947. Pp. 174. $3.00.

Preparing an employee handbook. London, England: Institute of Personnel
Management, 1947. Pp. 36. 2s. 6d.


Journal of Applied Psychology 


Vol. 32, No. 2 April, 1948 


The Ninety-Fourth Issue of the Psychological Barometer 
and a Note on Its Fifteenth Anniversary 


Henry C. Link 
The Psychological Corporation, New York City 


By the time this is published, the Psychological Barometer will have
passed its fifteenth anniversary. It was in March, 1932, that fifteen
psychologists, cooperating with The Psychological Corporation, made a
survey by means of 1578 personal interviews in as many homes in fifteen
cities from coast to coast. The psychologists volunteered their time and
that of their students in making this survey. The report of this survey
in the Harvard Business Review of January, 1933 (4), gives complete details
as to methods and results. Additional psychologists cooperated in
several enlarged surveys which helped to establish the Psychological
Barometer on a firm, self-supporting basis. The second report was made
in the Journal of Applied Psychology in February, 1934 (5).

From the outset, the Barometer has been used as a measure of public
opinion, of public attitudes, of ability to identify certain advertising
themes, of public behavior, especially buying, reading and radio listening
habits. The first survey reported the development of the triple associates
test for measuring an important aspect of advertising effectiveness (4).
This test, illustrated by such a question as this: What coffee advertises:
“Look for the Date on the Can”? is now one of the chief supports of the
Barometer. It has also become a standard test in the fields of advertising,
politics and propaganda generally.

Probably the earliest continuous public opinion poll made entirely
with personal interviews is recorded in the following results from six
Psychological Barometer surveys (11).

Since November, 1937, in two ten thousand interview Barometer
surveys each year, the favorable or unfavorable attitudes toward eight
of the country’s leading companies are ascertained. These eight companies,
who also underwrite this service, make extensive use of it in planning
and in measuring the effects of their public relations activities.
This is the oldest continuous series of public attitude or public relations




Question: From what you have seen of the National Recovery Act in your
neighborhood, do you believe it is working well?

                     %     %     %
Yes                 48    41    55    50    38
No                  27    30    22    23    26
Uncertain           25    29    23    27    36

Total Interviews  1932  2386  3076  5167  4000  3710


surveys in the field. In 1944 another semi-annual attitude Index for
another group of companies was begun.

The use of these Barometers in measuring people's brand buying
habits, described elsewhere, continues. Some day, as a result of these
attitude and behavior measures, we may be able to make interesting
contributions to the problem of the relationship between attitudes and
behavior. The fact that these surveys are made four times a year with
10,000 interviews and twice a year with 5000 interviews provides an
unusually broad base for research of this kind.

Other aspects of these Barometer surveys are described in the
publications listed at the end of this report.

Probably the chief limitation of the Psychological Barometer is that
it has always been based on an urban sample. This has made it impossible
to use these surveys for predicting elections, the feature by which
the Gallup and Fortune surveys have become best known.

A source of great regret to the writer is the fact that surveys like the
Barometer, the Gallup, Crossley, Roper and other polls are not more
widely recognized as the peculiar instruments of psychological research.
Every so often people who learn that I am a psychologist ask me what a
psychologist does. When I tell them that one of my chief activities is


them that psychology is defined as the study of behavior or habits, and
that these surveys are scientific attempts to measure the behavior and
habits of representative samples of people.

It seems a pity that so many people, including college graduates, have
so little understanding of psychology and its major fields. Public opinion
polls, for instance, are identified with Dr. Gallup far more frequently
than with psychology, even though Gallup is a psychologist. They are
identified with Fortune and other magazines, with the fields of journalism,
politics, political science, but hardly at all with psychology. And yet,
historically, methodologically and in almost every other way, public opinion
polls and behavior surveys are peculiarly the field of individual and
social psychology.

Psychologists themselves, rather than the public or other disciplines, 
are responsible for this situation. What has happened here is much like 
that which has happened in the fields of motion study and personal coun- 
seling or clinical psychology, to mention only two. Motion study in 
this country was taken over by the engineers and efficiency experts. 
Clinical psychology has been increasingly taken over by the psychiatrists. 
The very name, clinical psychology, implies the field of crisis therapy, the 
medical field. Instead of distinguishing its own unique approach, especially
in the great field of near-normal behavior problems, the very use
of such terms as clinical, therapy, and others has helped to push what 
should be psychological problems into the field of medicine. A few psy- 
chologists, including Cyril Burt in England, have recently begun to pro- 
test against this trend now so powerfully established. 

It is sometimes said: What difference does it make whether a given 
field falls into one discipline or another, so long as it is being adequately 
handled? The question is: Are these fields being adequately handled by 
the other disciplines? Or even, can they be? So far as vocational, edu- 
cational and certain emotional problems are concerned, many psycholo- 
gists would answer with a positive no. They would agree that the pre- 
dominantly abnormal or psychiatric approach to many of these problems 
can not take the place of the approach of a psychology predominantly 
interested in the more normal. 

In recent years psychologists have made considerable progress in
catching up with the procession in the field of public opinion and behavior
surveys. To be sure, the prize plum of recent years is the study
of sex, being made by the zoologist, Alfred C. Kinsey, under a substantial
grant from the Rockefeller Foundation (3). Since its beginning about
eight years ago, personal interviews have been made with over 12,000
people and 100,000 interviews are planned over a period of years. The
techniques used are those which psychologists, among all the professions,
have probably done most to develop: the interviewing technique, the
clinical case study, the questionnaire, the standardized inventory, the
statistical treatment of such data, and sampling methods.

The name, Psychological Barometer, was chosen in the first place because
of our conviction that the techniques involved stemmed directly
from the basic concepts and methods of psychology, both personal and
social. The Gallup and the Roper or Fortune polls are psychological
barometers just as much as is our poll. However, even though Gallup
is a psychologist and Roper has made good use of psychologists, their
polls are not generally regarded as integral aspects of psychology.
Gallup's contributions to this field are known to the initiate, and his book
The Pulse of Democracy (Gallup, George and Rae, Saul Forbes. New
York: Simon & Schuster, 1940), in the writer's opinion, marks the beginning
of a new era in Social Psychology. And yet, it was only after he
had achieved wide public recognition that some psychologists began to
take him seriously.

When we first began the Psychological Barometer, some psychologists
even questioned its right to the title, psychological. This was partly
because our surveys dealt with such commercial matters as buying habits
and advertising. Once more, the traditional distinction between pure
research and applied research worked to the disadvantage of psychologists.
It hindered their grasping quickly the great possibilities of such
surveys not only for pure research but for applied research in the vast
ranges of social and personal psychology.

During the past decade there has been a great increase in the number
of psychologists interested in survey techniques. However, most of
this interest has been concentrated in the field of public opinion and political
polls. Many psychologists still fail to see that, from the standpoint
of progress in techniques, advertising and market surveys have far


individual may buy. 

In political terms, the scope of economic democracy, or democracy of
the market place, far exceeds that of political democracy. In scientific
terms, market research offers far greater possibilities than does political
research. The latter can validate its findings only once a year, the former
every day in the year. The former has one behavior standard for valid




Barometer surveys in their courses in social and applied psychology, their 
new textbooks and their own research projects. 


The 94th Psychological Barometer 


The current survey was made with 5000 interviews during October, 
1947, by 380 interviewers under the supervision of 144 psychologists in 
147 cities and towns. It represents a true cross-section of the urban 
population. Two questionnaires were used, one with one-half the sample, 
or 2500 people, the other questionnaire with the other half of the 5000 
people. These two sub-samples were comparable by geographic, sex, 
socio-economic, and other criteria. Each question in this study was asked
of one or the other of these two samples of 2500 people.
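The practical cost of splitting the 5000 interviews into two half-samples can be illustrated with a rough calculation. The sketch below is not taken from the report: it assumes simple random sampling and the usual normal approximation for a proportion, neither of which the modified area sample strictly satisfies, and the function name `margin_of_error` is invented for the illustration.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p based on n interviews.

    Assumes simple random sampling (an assumption; the Barometer used a
    modified area sample, so the true error is likely somewhat different).
    """
    return z * math.sqrt(p * (1.0 - p) / n)

# Worst case (p = 0.5) for the full sample and for one half-sample.
full_sample = margin_of_error(0.5, 5000)
half_sample = margin_of_error(0.5, 2500)

print(f"n = 5000: about +/- {100 * full_sample:.1f} percentage points")
print(f"n = 2500: about +/- {100 * half_sample:.1f} percentage points")
```

Under these assumptions, halving the sample widens the margin by a factor of the square root of two, which is the price paid for asking twice as many questions.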

The October survey included several questions on the lighter aspects 
of American life as well as a series on the political issues of the day. That 
is, the questions dealt not only with such topics as ownership of pets, 
musical instruments, and the extent and means of travel but also with 
people’s opinions on the present labor laws, the cost of living and the
prospects of peace.

Sampling Method. A modified area sampling method was used. All
interviews were assigned by the local supervising psychologist by blocks
and streets in accordance with maps constructed to designate the proper
socio-economic levels. These maps are made to divide the population
into four principal groups: the “A” group consisting primarily of owners
and executives; the “B” group, primarily white-collar and semi-professional;
the “C” group, or skilled factory and transportation workers; the “D”
group, or the less skilled. About 28 per cent of the sample are union
members. All interviews were made in the home, but only one in a
family; half were made with women, half with men.


The Ownership of Pets 


In the October Psychological Barometer, a question asked in October
1938 was repeated in order to measure trends in the ownership of pets.
The results for the two studies are given below.


Oct. Oct. 
Answers 1938 1947 

Dog 27% 27% 
Cat 18 13 
Canary, parrot, or other bird 8 3 
Fish 2 1 
Others 2 2 


Total Interviews 5623 2500 




Thus, while the percentage owning dogs and cats has not changed
at all, canaries, parrots and other birds are owned by fewer urban families
today than in 1938. The detailed results show that pets are owned just
as frequently by the poor or middle class families as by the more affluent.


The Playing of Musical Instruments 


Q. “Does anybody in your family play a musical instrument, and if so, who plays 
and what?” 


                                         Socio-Economic Groups
Answers                          Total     A      B      C      D

Total families in which musical
  instruments are played          42%     56%    51%    38%    31%
  1 member of family plays        28      27     32     27     23
  2 members of family play        10      20     12      8      7
  3 members of family play         3       5      5      2      1
  4 members of family play         1       4      2      1      *
Total families in which no
  members play                    58      44     49     62     69

Total Interviews                2500     250    750   1000    500

* Less than .5%.


Kinds of Instruments Played

                                   Sex
Instruments        Total      Men     Women

Piano               10%       42%      88%
Violin               7        10        5
Clarinet             4         9        1
Guitar               3         7        1
Trumpet              3         7        1
Saxophone            3         6        1
Drum                 2         5        *
Accordion            2         3        1
Harmonica            1         1        *
Cello                *         1        *
Miscellaneous        9        17        4

* Less than .5%.

Note: The total per cents in the above table add to more than 100 because some
people play more than one instrument.


A Question of Chivalry 


Q. “Do you think a man should stand up and give a woman his seat in a bus, train
or street car?”

                            Socio-Economic Groups           Sex
Answers             Total     A      B      C      D     Men   Women

Yes, give up seat    42%     36%    37%    42%    51%    42%    41%
No                   13      13     13     13     10     12     14
Depends              45      51     50     45     39     46     45

While men and women agree quite closely on this subject, the people of
higher education and wealth are less likely to believe in this form of chivalry
than are those of less education and wealth.


Extent and Means of Travel

Q. “How many trips of 500 miles or more one way did you make since last October?”

                         Socio-Economic Groups
Answers          Total     A      B      C      D

None              60%     36%    51%    65%    75%
1 trip            22      25     25     21     18
2 trips            8      17      1      7      4
3 trips            3       6      4      2      1
4 trips            2       4      3      1      1
5 trips            1       4      1      1      *
Over 5 trips       3       8      4    „78      *
Don't know         1       *      1      1      1

* Less than .5%.
Q. “How did you go, by train, air, bus, or auto?”

                         Socio-Economic Groups
Answers          Total     A      B      C      D

Auto              49%     46%    49%    54%    41%
Train              ж      32     30     27     33
Air               12      18     15      7      5
Bus                5       2      3      8      9
Ship               1       *      *      *      2
Don't know         4       2      3      4     10


Q. “If the cost were the same, how would you go next time, by train, air, bus, or
auto?”

                         Socio-Economic Groups
Answers          Total     A      B      C      D

Auto              46%     42%    41%    49%    39%
Train             20      22     18     17     28
Air               29      33     31     27     23
Bus                2       2      1      2      5
Ship               *       H     баё     1      -
Don't know         3       +      3      4      5

* Less than .5%.


According to these results, more than twice as many people would 
travel by air if the cost were the same than have travelled by air since 
October 1946. Trains would suffer the greatest loss of passengers if the 
costs were the same. 
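The comparison drawn above can be checked directly from the legible cells of the two preceding tables. The following sketch is only an arithmetic illustration; the variable names are invented here, and the total-sample figure for trains is omitted because it is not legible in the table of trips actually made.

```python
# Per cent naming each mode, copied from the legible cells of the two tables.
# Names are illustrative only; they do not come from the report.
GROUPS = ("A", "B", "C", "D")

air_actual = {"Total": 12, "A": 18, "B": 15, "C": 7, "D": 5}
air_if_same_cost = {"Total": 29, "A": 33, "B": 31, "C": 27, "D": 23}

train_actual = {"A": 32, "B": 30, "C": 27, "D": 33}
train_if_same_cost = {"A": 22, "B": 18, "C": 17, "D": 28}

# Ratio of preferred to actual air travel, for the total and for each group.
air_ratio = {k: air_if_same_cost[k] / air_actual[k] for k in air_actual}

# Change, in percentage points, in the share naming the train.
train_change = {g: train_if_same_cost[g] - train_actual[g] for g in GROUPS}

for key, value in air_ratio.items():
    print(f"Air, {key}: {value:.2f} times the actual share")
for group, change in train_change.items():
    print(f"Train, group {group}: {change:+d} points")
```

The ratio for the total sample is 29 to 12, which is consistent with the statement that more than twice as many people would travel by air; the train loses ground in every socio-economic group.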


Whom Do the Labor Laws Favor 


Q. “Do you think that the present laws regulating labor unions favor business,
favor labor unions, or are fair to both?”

                              Socio-Economic Groups
Answers               Total     A      B      C      D

Are fair to both       42%      4%    42%    41%    39%
Favor labor unions     22      31     25     22     16
Favor business         18      12     18     20     19
Don't know             18      13     15     17     26

                            Sex            Union Membership
                                           Union      Non-
Answers                Men    Women       Members     Union

Are fair to both       48%     35%         40%         42%
Favor labor unions     19      26          15          25
Favor business         23      13          30          14
Don't know             10      26          15          19


Labor Union Monopolies vs. Company Monopolies


Q. “Which monopolies are more dangerous: monopolies by big companies or monopolies
by big labor unions?”

                                  Socio-Economic Groups
Answers                   Total     A      B      C      D

Labor union monopolies     31%     43%    33%    82%    22%
Company monopolies         16       9     16     16     18
Equally dangerous          42      44     45     41     39
Don't know                  1       4      6     11     21

                                Sex            Union Membership
                                               Union      Non-
Answers                    Men    Women       Members     Union

Labor union monopolies     29%     34%         21%         35%
Company monopolies         16      16          24          18
Equally dangerous          48      36          43          42
Don't know                  7      14          12          10


The Cost of Living 


In view of the current concern over the high cost of living, we included
several questions on what people thought were the reasons for the high
prices, how they thought prices could be reduced, and what they thought
of a return to rationing.


Q. “What do you think are the main reasons for the high prices of food and other
products?”


A classification of the responses to the above question revealed that
more people blame the Government or labor unions for high prices today
than blame any other agencies.


         Socio-Economic Groups        Union Membership
                                      Union      Non-
Total     A      B      C      D     Members     Union

 29%     31%    29%    29%    28%      31%        28%
28 з 30 з 22 24 29 
16 10 15 17 19 19 14 
10 14 11 9 6 8 10 

9 12 11 8 6 7 10 

3 4 3 3 4 3 Б] 
24 АЗЫ O3 4.01 25 28 | 


12 ^8 10 12 7 п 12 




Those people who named the Government as primarily responsible 
gave as the principal reasons: Government shipping goods overseas, 
Government subsidies keeping prices up, Government removal of price 
controls, Government politics, high taxes. Those who named labor gave 
as more specific reasons: the high wages unions were asking, too many 
strikes. Those who named manufacturers and business concerns as the
cause said that companies were making too much money, were holding 
back their products for higher prices, were charging too much for their 
services as middle man. Among the miscellaneous reasons given were: 
aftermath of the war, not enough consumer resistance, increased cost of 
production, speculation. 


Q. “Because of the high prices and shortage of food, do you think we should go 
back to food rationing?” 


                      Socio-Economic Groups        Union Membership
                                                   Union      Non-
Answers       Total     A      B      C      D    Members     Union

No             65%      1%    64%    63%    65%     62%        66%
Yes            30      26     31     31     27      34         28
Don't know      5       3      5      6      8       4          6


Q. “Do you think prices can best be reduced by more government regulation or
by fewer government regulations?”

                          Socio-Economic Groups        Union Membership
                                                       Union      Non-
Answers           Total     A      B      C      D    Members     Union

By more gov’t
  regulations      44%     36%    43%    45%    46%     48%        42%
By fewer gov’t
  regulations      42      56     46     40     35      38         45
Don’t know         14       8     11     15     19      14         13


Changes in Family Prosperity 


Included in the October Psychological Barometer was a question we 
have been asking consistently since October 1941 in order to measure 
trends in people’s beliefs about their living standards. 


Q. “Is your family more prosperous or better off today than two years ago, less
prosperous, or the same?”




The following table gives the results for the two earliest studies and 
for the last two: 


                      Oct.    Oct.    April    Oct.
Answers               1941    1942    1947     1947

More prosperous        38%     29%     29%      24%
The same               47      47      42       46
Less prosperous        15      21      26       28
Uncertain               -       3       3        2

The group which seems, to a slight degree, to be best off, according
to its own answers, is the “B” or white-collar and semi-professional group.
The opinions of union members are very much like those of non-union
members.


                     Socio-Economic Groups        Union Membership
                                                  Union      Non-
Answers               A      B      C      D     Members     Union

More prosperous      23%     2%    24%    20%      23%        25%
The same             48     46     44     49       44         46
Less prosperous      26     25     29     28       29         27
Uncertain             3      2      3      3        4          2


Beliefs About Socialism in England

Q. “Did you know that England has a labor socialist government now?”

                      Socio-Economic Groups
Answers       Total     A      B      C      D

Yes            66%     84%    80%    63%    42%
No             24      10     15     26     39
Don't know     10       6      5     11     19


Q. “Do you think that socialism in England will succeed or fail?”

                      Socio-Economic Groups
Answers       Total     A      B      C      D

Fail           47%     57%    56%     4%    29%
Succeed        12      19     13     10     11
Don't know     41      24     31     42     60


The Diminishing Prospects for Peace 


Since 1943, we have asked in seven different surveys the question: 


Q. “After this war (or, now that the war is over) do you think that we will make a
peace settlement that will last or do you think that we will have another world war in
twenty-five years or so?”

The responses to this question indicate that more people expect war
today than ever in the last five years.


                             Feb.    Oct.    Oct.    Oct.    Oct.
Answers                      1943    1944    1945    1946    1947

Will have another war         43%     54%     59%     74%     77%
Will make a lasting peace     47      28      28      18      11
Don't know                    10      18      13       8      12


The expectation that Russia will be the next enemy has been con- 
stantly increasing. 


Q. “Who do you think will be our next enemy?”

Country Named               1944    1945    1946    1947

Russia                       29%     37%     56%     67%
Germany                       9       2       2       1
Japan                         5       5       1       1
England                       4       3       8       1
China                         1       1       1       1
Don't know                    6      11      11       6

Total % Expecting War        54      59      74      77


More than half of those who said in October that they expected war 
said they expected it within 10 years. 


Q. “In how many years do you think it will be?”

Answers                        Oct. 1947

1 year or less                     4%
2-4 years                         10
5-10 years                        37
11-19 years                       12
20-25 years                        8
Over 25 years or uncertain         6

Total % Expecting War             77
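The statement that more than half of those expecting war expect it within 10 years follows from the table above. The short sketch below simply repeats that arithmetic; the variable names are chosen here for illustration.

```python
# Per cent of the whole October 1947 half-sample giving each answer,
# copied from the table above.
years_until_war = {
    "1 year or less": 4,
    "2-4 years": 10,
    "5-10 years": 37,
    "11-19 years": 12,
    "20-25 years": 8,
    "Over 25 years or uncertain": 6,
}

total_expecting_war = sum(years_until_war.values())
within_ten_years = (
    years_until_war["1 year or less"]
    + years_until_war["2-4 years"]
    + years_until_war["5-10 years"]
)

# Share of those expecting war who place it within ten years.
share_within_ten = within_ten_years / total_expecting_war

print(f"Expecting war: {total_expecting_war}% of the sample")
print(f"Within ten years: {within_ten_years}% of the sample")
print(f"Share of those expecting war: {100 * share_within_ten:.0f}%")
```

The rows sum to the 77 per cent shown in the total line, and the three shortest intervals account for 51 of those 77 points, or roughly two-thirds of those who expect war.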


Received January 30, 1948. 
Early publication. 




References 


1. Jenkins, J. G. Dependability of psychological brand barometers: I. The problem
of reliability. J. appl. Psychol., 1938, 22, 1-7.

2. Jenkins, J. G., and Corbin, H. H., Jr. Dependability of psychological brand
barometers: II. The problem of validity. J. appl. Psychol., 1938, 22, 252-260.

3. Kinsey, A. C. Sexual behavior in the human male. Philadelphia: W. B. Saunders
Co., 1948.

4. Link, H. C. A new method for testing advertising effectiveness. Harv. bus. Rev.,
1933, 11, 165-177.

5. Link, H. C. A new method for testing advertising and a psychological sales barometer.
J. appl. Psychol., 1934, 18, 1-26.

6. Link, H. C. How many interviews are necessary for results of a certain accuracy.
J. appl. Psychol., 1937, 21, 1-17.

7. Link, H. C., and Lorge, I. The psychological sales barometer. Harv. bus. Rev.,
1935, 13, 193-204.

8. Link, H. C. Workers’ reactions to industrial problems in a war economy. J. appl.
Psychol., 1942, 26, 416-438.

9. Link, H. C. The eighth nation-wide social experimental survey. J. appl. Psychol.,
1943, 27, 1-11.

10. Link, H. C. The ninth nation-wide social experimental survey. J. appl. Psychol.,
1944, 28, 1-15.

11. Link, H. C. Some milestones in public opinion research. J. appl. Psychol., 1947,
31, 225-234.

12. The Psychological Corporation. A study of public relations and social attitudes.
J. appl. Psychol., 1937, 21, 589-602.


Studies of Job Evaluation. 7. A Factor Analysis of 
Two Point Rating Methods of Job Evaluation 


C. H. Lawshe, Jr., Edmund E. Dudek, and R. F. Wilson 
Division of Applied Psychology, Purdue University 


As job evaluation techniques become more widely and frequently 
used, more and more questions concerning the applicability and effective- 
ness of these systems arise. Some of these questions relate to the types 
of evaluation systems available, the jobs to which these systems are ap- 
plicable, the number of scale items needed for effective evaluation, the 
reliability of scales of different lengths, and the number of separate and 
distinct factors actually involved in these scales. In previous studies 
(5, 6, 7, 8, 9, 10), information has been presented bearing on several of
these problems. 

In this study an attempt is made to obtain some information concern- 
ing the basic factors involved in two different point-rating scales, viz., 
the NEMA Job Evaluation System (4) and a Simplified Job Evaluation 
System devised by the senior author (10). Specific questions were: 
What are the separate and distinct factors which are operating in these 
two systems? Which factors do the systems have in common and which, 
if any, are specific to one or the other system? And, how great a dis- 
crepancy in factor loadings or weights is there between the two systems 
in the factors which they have in common? Answers to these questions 
will help to indicate to what extent the same factors or elements are
evaluated by the two methods. 


Procedure 


The Job Evaluation Systems. A factor analysis was made of the intercorrelations
between ratings of forty jobs made by twenty analysts
using two job evaluation methods. The NEMA System, as adopted by 
the National Electrical Manufacturers Association, provides for the rating 
of jobs on eleven items in four categories: namely, Skill (Education, Ex- 
perience, and Initiative and Ingenuity), Effort (Physical Demand and 
Mental and Visual Demand), Responsibility (Equipment or Process, 
Material or Product, Safety of Others, Work of Others), and Job Condi- 
tions (Working Conditions, and Unavoidable Hazards). Each item is 
rated on a weighted five-point scale. The Simplified System provides
for ratings on four items: Learning Period, General Schooling, Working 



Table 1
Occupations Rated

Job      USES D.O.T.                                                   NEMA
Number   Code         USES Job Title                                Labor Grade

 1       1-23.14      Messenger I                                        2
 2       1-38.91      Stockroom Man                                      5
 3       1-38.05      Tool Clerk                                         3
 4       2-61.08      Watchman                                           3
 5       2-82.10      Charwoman                                          2
 6       2-84.10      Janitor I                                          2
 7       2-86.20      Sweeper                                            2
 8       2-95.30      Elevator Operator, Freight                         3
 9       3-40.04      Grounds Keeper I                                  11
10       4-75.010     Machinist Maintenance                              1
11       4-75.010     Machinist (All Around)                            11
12       4-76.210     Tool and Die Maker                                11
13       4-78.011     Engine Lathe Operator                              7
14       4-78.042     Horizontal Boring and Milling Machine Opr.         8
15       4-78.061     Shaper Operator                                    7
16       4-78.071     Planer Operator                                    8
17       4-78.503     Surface-Grinder Operator                           7
18       4-80.010     Sheet Metal Worker II                              8
19       4-83.100     Boilermaker Maintenance                            9
20       4-85.020     Welder, Arc                                        7
21       4-85.030     Acetylene Welder                                   7
22       4-86.010     Blacksmith II                                     10
23       4-87.010     Heat-Treater                                      10
24       4-97.420     Electrical Repairman                              10
25       5-25.830     Carpenter, Maintenance                             9
26       5-27.010     Painter, Maintenance                               8
27       5-29.100     Mason-Plasterer                                    9
28       5-30.210     Plumber, Maintenance                               9
29       5-73.010     Electric-Bridge-Crane Operator                     4
30       5-78.100     Millwright                                        11
31       5-83.611     Maintenance Man, Building                         10
32       5-84.110     Tool Grinder Operator                              8
33       5-88.020     Rigger III                                         9
34       6-77.020     Buffer                                             5
35       6-78.011     Band Saw Operator                                  5
36       6-78.632     Assembler                                          7
37       7-10.010     Fireman, Low Pressure                              4
38       9-65.14      Boiler Cleaner                                     4
39       9-71.01      Oiler I                                            4
40       9-55.02      Automobile Washer                                  3



Conditions, and Job Hazards. The items provide for rating in five, six, 
or seven defined degrees. А more complete description has been pre- 
viously given by Lawshe and Wilson (10). 

The Job Descriptions Rated. Ratings were made of forty jobs from 
job descriptions which had been adapted from the USES National Job 
Description Series (2, 3). Twenty-four of these jobs were classified by 
the Dictionary of Occupational Titles as skilled, four were semiskilled, three 
unskilled, and the remaining nine as clerical, service, and agricultural. 
A list of these jobs, their corresponding USES code numbers, and NEMA 
labor grades is presented in Table 1. 

The Ratings. Twenty analysts, most of them personnel department 
supervisors, job evaluation supervisors, or job analysts, participated in 
this study. As discussed in a previous article (10) each analyst rated only 
twenty of the forty jobs. There were actually ten independent ratings 
of each job; five by the NEMA System, and five by the Simplified System. 
For each job the item ratings made by the five analysts were averaged; 
thus giving a composite rating on each item and the total for each job. 
The reliabilities of the several items in the NEMA System, determined by 
intercorrelating the ratings of five analysts were previously reported and 
discussed by Lawshe and Wilson (10), as ranging from .72 to .96, and re- 
liability for “Total Points” was .94. The reliabilities of the items in the 
Simplified System, obtained in the same way, ranged from .84 to .97, and 
“Total Points” reliability was .98. These reliabilities are shown in the 
last column of Table 2. 

Intercorrelations and Factor Analysis. The average item ratings for 
the forty jobs were then intercorrelated and the resulting r’s are shown 
in Table 2. This correlation matrix was factor analyzed using the centroid
method as described by Guilford (1). After five factors were extracted
the process was discontinued when several criteria¹ indicated that
additional factors would be the result of chance variance only. The 
rotations were performed using the graphic method described by Guilford 
(1) and the factor loadings before and after rotation are presented in 
Table 3. 
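
The first step of a centroid extraction can be sketched as follows. This is a minimal illustration only, not the authors' worksheet: the 3 x 3 matrix is hypothetical, the diagonal holds guessed communalities (here the highest correlation in each column), and the sign-reflection step needed before extracting later factors is omitted. The function name is an assumption made for this example.

```python
import math


def first_centroid_factor(r):
    """Return (loadings, residual) for the first centroid factor of matrix r.

    r is a square correlation matrix whose diagonal cells already hold
    communality estimates.
    """
    n = len(r)
    # Column sums of the matrix, including the diagonal estimates.
    col_sums = [sum(r[i][j] for i in range(n)) for j in range(n)]
    total = sum(col_sums)
    root = math.sqrt(total)
    # Each first-factor loading is the column sum over the root of the grand total.
    loadings = [s / root for s in col_sums]
    # Residual correlations after removing the first factor.
    residual = [
        [r[i][j] - loadings[i] * loadings[j] for j in range(n)]
        for i in range(n)
    ]
    return loadings, residual


# Hypothetical values, not taken from Table 2.
example_r = [
    [0.60, 0.60, 0.50],
    [0.60, 0.60, 0.40],
    [0.50, 0.40, 0.50],
]
loadings, residual = first_centroid_factor(example_r)
print([round(a, 3) for a in loadings])
```

Each column of the residual matrix sums to zero at this stage, which is why later centroid factors require reflecting the signs of some variables before the same computation is repeated.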

Findings and Interpretations 


Identification of Factors. In Table 4 the scale items are ranked under 
each factor according to size of rotated factor loading (.400 or above).
These loadings were used in defining the factors. Loadings on “Total 
Points” were not considered in defining factors. 

1 For example, the sum of the guessed communalities equaled 14.381; the sum of the
computed communalities after four factors equaled 13.970, after five factors 14.525.
Using Tucker's φ test (11), the limiting value for φ was determined at .889, and φ for
four factors was .859, while for five factors it was .890. The largest cell value in the
fourth residual matrix was .126, in the fifth .095.


Table 2
Reliabilities and Intercorrelations Between Items and Total Points in the NEMA and
Simplified Job Evaluation Systems

[Table body not legible in the scanned source.]


C. H. Lawshe, Jr., E. E. Dudek, and R. F. Wilson


Table 3
Factor Loadings Before and After Rotation

[Table body not legible in the scanned source.]


Studies of Job Evaluation. 7


Table 4
Factor Names with Scale Items Arranged in Order of Magnitude of Loadings*

Factor                          Item and Loading
I—Skill Demands (General)       1—Education                     .774
                                14—Learning Period              .749
                                13—General Schooling            .730
                                3—Initiative and Ingenuity      .730
                                2—Experience                    .717
                                7—Resp.—Material                .462
                                9—Resp.—Work of Others          .450
                                16—Job Hazards                  .428
                                5—Mental and Visual Demands     .413
II—Supervisory Demands          9—Resp.—Work of Others          .650
                                8—Resp.—Safety of Others        .647
                                5—Mental and Visual Demands     .563
                                2—Experience                    .531
                                3—Initiative and Ingenuity      .488
                                1—Education                     .486
                                7—Resp.—Material                .474
                                14—Learning Period              .428
                                6—Resp.—Equipment               .423
III—Job Characteristics—        4—Physical Demands              .875
    Non-hazardous               10—Working Conditions           .854
                                15—Surrounding Conditions       .843
                                16—Job Hazards                  .433
IV—Job Characteristics—         16—Job Hazards                  .689
    Hazardous                   11—Unavoidable Hazards          .655
V—Job Responsibility            7—Resp.—Material                .617
                                6—Resp.—Equipment               .592
                                13—General Schooling            .567
                                5—Mental and Visual Demands     .440
                                14—Learning Period              .434
                                3—Initiative and Ingenuity      .420
                                15—Surrounding Conditions       .404


* Items from NEMA System appear in light face and those from the Simplified
System are in bold face. (Items 1 to 11 belong to the NEMA System and items
13 to 16 to the Simplified System.) Only loadings of .400 or greater are listed.


“Skill Demands, General.” Factor I appears to be a general intellectual
ability or general skill factor. It has high loadings for items from
both systems, namely: from the NEMA, Education .774, Initiative and
Ingenuity .730, and Experience .717; and from the Simplified, Learning
Period .749, and General Schooling .730. It corresponds to the “Skill
Demands—General” factor found in previous studies (5, 7, 8, 9) and has
been designated as such.


“Supervisory Demands.” Factor II appears to be specific to the
NEMA system. The highest loadings for NEMA items were: Responsibility
for Work of Others .650, Responsibility for Safety of Others .647,
Mental and Visual Demands .563, and Experience .531. Only one item,
Learning Period, from the Simplified system appeared in this factor with
a medium loading of .428. The two items with the highest loadings,
Responsibility for Work of Others and Responsibility for Safety of Others,
seem to indicate this factor is largely one of supervision. The fact that
the other two responsibility items, Responsibility for Equipment or Process
and Responsibility for Material or Product, did not show up high in
this factor seems to lend additional support to the conclusion that this is
a Responsibility for People, i.e., Supervision, rather than a “Responsibility
for one's own work” factor. The appearance of several other items
with medium loadings in this factor suggests that it may possibly involve
general responsibility but, due to the isolation of another responsibility
factor (Factor V), it seemed best to define this factor tentatively as
"Supervisory Demands." This corresponds to a factor found in a previous
study (7).

“Job Characteristics—Non-Hazardous.” Factor III appears to be a
clear-cut factor pertaining to the physical characteristics of the job without
regard to skill demands. The high loadings from the NEMA scale were
for Physical Demands .875 and Working Conditions .854; for the Simplified
scale for Surrounding Conditions .843, and a medium loading for
Job Hazards .433. This factor is very similar to factors found in previous
studies (5, 7, 8, 9) and has been similarly called “Job Characteristics,
Non-Hazardous.”

“Job Characteristics—Hazardous.” Factor IV is another rather clear-cut
factor. Only one item from each scale had significant loadings in
this factor, from the NEMA, Unavoidable Hazards .655, and from the
Simplified, Job Hazards .689. This then seems to refer to the hazardous
conditions involved in the job and it has been named “Job Characteristics,
Hazardous” which is similar to a factor found previously (5).

“Job Responsibility.” Factor V appears to be another responsibility
factor. Items with highest loadings from the NEMA scale were Responsibility
for Material or Product .617 and Responsibility for Equipment or
Process .592, and from the Simplified scale, General Schooling .567.
Medium loadings were found for Mental and Visual Demands .440 and
Initiative and Ingenuity .420 from the NEMA, and for Learning Period
.434 and Surrounding Conditions .404 from the Simplified scale. The
emphasis seems to be on responsibility for material things, or for one's own
work rather than for work of others, or a matter of carefulness instead of
supervision. It has therefore been designated “Job Responsibility.”




Discussion of Results 


Of the five factors found, Factors I, III, and IV appear to be quite 
clearly defined while Factors II and V are much more general in nature.
There seems to be some indication, from the items having medium 
loadings in these latter two factors, that these two are composite factors 
containing some item variance which might be drawn out by a sixth 
factor if provision were made for its isolation by inclusion of several other 
items. This factor might be of the nature of “Specific Skill Demands". 

The authors do not intend to imply that the five factors identified here 
would completely account for job evaluation of all types of jobs. There
are numerous other occupations in the professional, clerical, and skilled
ranks which probably could not be adequately evaluated on such a scale. 
However, if it can be assumed that these two job evaluation systems in- 
clude all of the important items necessary for evaluation of jobs for which 
they were designed, it seems that a scale comprising five factors could 
satisfactorily achieve the desired purpose. 

The hypothesis is suggested that for each “family” or group of similar
occupations a core of items designed to measure the five factors found in
this study may be used in setting up a job evaluation rating system. 

It is possible that another item, specific to the occupations in question,
should be added to allow for evaluation of any unusual aspects that are
not general to all or most occupations. This hypothesis is supported by
the findings of Lawshe and Satter (5) who found a factor called “Attention
Demands,” specific to jobs in a plant manufacturing small caliber
ammunition where many jobs consisted of machine “attending” and visual
inspection. This factor was not found in two other plants of a different
nature where jobs did not require “attention” or “inspection” to as high
a degree. It is further supported by the findings of Lawshe and Alessi
(8), in that a factor identified as “Skill Demands—Specific” (in addition
to a “Skill Demands—General” factor) was found. The results from
these studies suggest the advisability of including an item specific to the
jobs in question as well as items based on the basic factors found in this
study but the testing of this hypothesis must be undertaken later. At
any rate, it appears that by employing a job evaluation system consisting
of item scales based on the five basic factors found, considerable time and
effort could be saved in comparison to that involved in using a longer
system, and, at the same time, as complete an evaluation could be made
as the longer system permits.

It should not be inferred, however, that the authors recommend
immediately abolishing presently used scales in favor of a short, five or six
item scale. It is realized that frequently it is desirable to use items
possessing what might be termed “face validity” in spite of the fact that
such items contribute nothing additional to the scale. That is, if
supervisors and employees believe that a certain element is important, and are
agreed that it should be included, it may be highly desirable that it be
included for policy reasons even though it may be statistically shown that
its contribution is nil. Care should be taken, though, to determine that
such an item does not detract from the reliability (and, if possible, from
the validity) of the scale if it is introduced.

[Figure 1: bar chart with one pair of bars for each of Factors I Skill Demands
(General), II Supervisory Demands, III Job Characteristics, Non-Hazardous,
IV Job Characteristics, Hazardous, and V Job Responsibility.]

Fig. 1. Relative contribution of each factor to total variance in the two systems.
The shaded bars represent the NEMA System and the solid bars represent the
Simplified System.


Comparison of Systems. Figure 1 indicates the relative variance in
the two systems attributable to each of the five factors. The “total
points” factor loadings for the NEMA System and for the Simplified
System respectively were squared and converted to per cent of h² for total
points for each factor. It will be observed that Factors I and V carry 
the most weight in the Simplified System and Factors I and II in the 
NEMA. This can be partially accounted for by considering the standard 
deviations of the ratings of the several items in each system. These 
S.D.'s are shown in Table 5. Inasmuch as the S.D.'s for items (1)
Education, (2) Experience, and (3) Initiative and Ingenuity in the NEMA
System are considerably larger than the S.D.'s for the other items, and 
the “Total Points” score was obtained by a simple summation of the 
point ratings on all items, it follows that the factors which included these 
three items would account for a major portion of the variance (h²).
Similarly, in the Simplified System, items (13) General Schooling and (14)
Learning Period, having relatively much larger S.D.'s, would contribute 
most to total points and to the resulting variance. It is not suggested 
that a similar distribution of variance would be found with other systems 
nor that the distribution found here is an optimum. It merely indicates 
the relative weight or importance of these factors taking into account the 
actual weights that were assigned to the items included in each system. 
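
The conversion described above, squaring each “total points” loading and expressing it as a per cent of the communality, can be sketched as follows. The loadings in the example are hypothetical placeholders, not the values from Table 3, and the function name is an assumption made for this illustration.

```python
def percent_of_communality(loadings):
    """Express each squared factor loading as a per cent of their sum (h^2)."""
    squared = [a * a for a in loadings]
    h2 = sum(squared)  # communality of the "total points" score
    return [100.0 * s / h2 for s in squared]


# Hypothetical loadings of one system's total points on Factors I to V.
hypothetical_total_points_loadings = [0.80, 0.45, 0.20, 0.10, 0.30]
shares = percent_of_communality(hypothetical_total_points_loadings)
for name, share in zip(["I", "II", "III", "IV", "V"], shares):
    print(f"Factor {name}: {share:.1f} per cent of h^2")
```

Because the shares are computed from squared loadings, a factor with twice the loading of another carries four times its share of the variance.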
Table 5
Standard Deviations of Ratings on Individual Items

                                              Approximate
Item                                  S.D.    Ratio*
NEMA
   1 Education                       13.10      4
   2 Experience                      23.54      7
   3 Initiative and Ingenuity        14.19      4
   4 Physical Demand                  6.46      2
   5 Mental and Visual Demand         3.19      1
   6 Resp.—Equipment                  8.25      1
   7 Resp.—Material                   3.29      1
   8 Resp.—Safety of Others           3.58      1
   9 Resp.—Work of Others             3.77      1
  10 Working Conditions               7.06      2
  11 Unavoidable Hazards              2.98      1
Simplified
  13 General Schooling               39.00      4
  14 Learning Period                 45.60      5
  15 Surrounding Conditions           8.28      1
  16 Job Hazards                     10.60      1


* The "approximate ratio" is derived by dividing each standard deviation by the
smallest one in that system.


It is appreciated that the factor loadings depend in part on the
rotations made between the factors found and that different rotations would
have resulted in different loadings and therefore different contributions
to the communality. However, in this study several factors (I, III, IV)
were quite clear-cut and only one other reasonable rotation, in addition
to those employed (between Factors II and V), seemed possible. This
rotation did not seem to permit as significant an interpretation of the
factors. It is therefore believed that the variance contribution of the
five factors shown in Figure 1, as based on the weights of individual items
and the most logical rotations of axes, indicates, at least in close
approximation, the relative importance of these factors.

The correlation between the average total point NEMA ratings in the
forty jobs and the average total point ratings under the Simplified System
was .90 (10). This lack of perfect agreement between the two systems
may in part be explained by the relative difference in weight or
importance of the several factors in each system as indicated above. However,
the statement that there appears to be substantial agreement between the
two systems seems to be adequately supported. Inasmuch as neither of
the systems is perfectly reliable, there is reason to believe the “true”
correlation between them is higher than the observed r of .90 and that it
may be estimated by correcting the obtained correlation for attenuation.
The resulting r of .94 leads to the conclusion that if the measures were
more reliable the correlation between them would approach 1.00 and that,
in effect, they both measure the same variable or variables.
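
The corrected value can be checked with the usual correction for attenuation, in which the observed correlation is divided by the square root of the product of the two reliabilities. The sketch below assumes that the “Total Points” reliabilities reported earlier (.94 for the NEMA System and .98 for the Simplified System) are the ones the authors used; the function name is chosen for this illustration.

```python
import math


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Estimate the correlation between true scores from an observed r.

    r_xy: observed correlation between the two measures
    r_xx, r_yy: reliability coefficients of the two measures
    """
    return r_xy / math.sqrt(r_xx * r_yy)


# Observed r between systems and the two "Total Points" reliabilities.
estimate = correct_for_attenuation(0.90, 0.94, 0.98)
print(round(estimate, 2))
```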

Validity. This study does not attempt to answer any questions or 
draw any conclusions regarding the validity of either system, since neither 
can be considered a criterion against which to evaluate the other. Be- 
cause it would seem profitable to conduct validity studies using actual 
wage data as a criterion, research along these lines is presently being 
planned. 

Summary and Conclusions 


Twenty analysts rated forty job descriptions by two job evaluation 
systems, the NEMA System and a Simplified System designed by the 
senior author. Intercorrelations were obtained between the item ratings 
made and Thurstone’s centroid method of factor analysis was used to 
determine the fundamental factors accounting for these intercorrelations. 
The following conclusions are supported. 

1. Five factors were found which seem to account for the elements 
considered in the two systems. These factors were tentatively identified 
as: Skill Demands (General), Supervisory Demands, Job Character- 
istics—Non-Hazardous, Job Characteristics—Hazardous, and Job Re- 
sponsibility. 

2. It seems quite possible that other factors not identified here, but 
peculiar to certain industries or job families may be isolated in future 
studies. 

3. It appears from the available evidence that, for accurate and com- 
plete job evaluation, fewer factors are necessary than are usually used in 
present job evaluation systems. The desirability of further investiga- 
tions to isolate and more clearly define these factors is indicated. 

4. No conclusions about validity can be drawn from this study due 
to the lack of a suitable criterion. Investigation of this problem also 
appears desirable. 

5. Although short job evaluation systems consisting of only a few
items may be statistically and logically justified, it may be practically 
advantageous to include additional items in the system which will make 
it more acceptable to raters and to employees. 

Received October 10, 1947. 


References 


1. Guilford, J. P. Psychometric methods. New York: McGraw-Hill, 1936, pp. 457-

2. Job descriptions for industrial service and maintenance jobs. National Job
Description Series, U. S. Department of Labor, U.S.E.S., Washington, D. C.: June, 1939.

3. Job descriptions for job machine shops. National Job Description Series. U. S.
Department of Labor, U.S.E.S., Washington, D. C.: April, 1938.




4. Job rating: definitions of the factors used in rating jobs—hourly rated occupations. 
Chicago: Industrial Relations Department, National Electrical Manufacturers 
Association. 1938. 

5. Lawshe, C. H., Jr., and Satter, G. A. Studies in job evaluation. 1. Factor analy- 
sis of point ratings for hourly-paid jobs in three industrial plants. J. appl. 
Psychol., 1944, 28, 189-198. 

6. Lawshe, C. H., Jr. Studies in job evaluation. 2. The adequacy of abbreviated 
point ratings for hourly-paid jobs in three industrial plants. J. appl. Psychol., 
1945, 29, 177-184. 

7. Lawshe, C. H., Jr., and Maleski, A. A. Studies in job evaluation. 3. An analysis
of point ratings for salary paid jobs in an industrial plant. J. appl. Psychol., 
1946, 30, 117-128. 

8. Lawshe, C. H., Jr., and Alessi, S. L. Studies in job evaluation. 4. Analysis of 
another point rating scale for hourly-paid jobs and the adequacy of an abbrevi- 
ated scale. J. appl. Psychol., 1946, 30, 310-319. 

9. Lawshe, C. H., Jr., and Wilson, R. F. Studies in job evaluation. 5. An analysis 
of the factor comparison system as it functions in a paper mill. J. appl. Psychol., 
1946, 30, 426-434.

10. Lawshe, C. H., Jr., and Wilson, R. F. Studies in job evaluation. 6. The reliability 
of two point rating systems. J. appl. Psychol., 1947, 31, 355-365. 

11. Wright, R. E. A factor analysis of the original Stanford-Binet scale. Psycho-
metrika, 1939, 4, 209-220. 


Improving the Selection of Linotype Trainees 


George C. Beamer 
North Texas Teachers College 


Lawrence D. Edmonson 
Counseling Bureau, University of Missouri 
and 


George B. Strother 
Duluth Branch, University of Minnesota 


In order to meet the increased educational demands produced by the 
GI Bill of Rights, the University of Missouri has inaugurated several non- 
collegiate vocational courses for veterans. These courses are in fields 
where demand for skilled personnel is high. Institutional or on-the-job 
training facilities seem inadequate to meet the demand. Such a course 
is the one semester linotype operators school at the University of Missouri. 
The school is integrated with an on-the-job training program on com- 
pletion of the school. The situation in the linotype school is ideal for 
the use of selection tests since there is a waiting list of students and a
high rate of turnover. Turnover is used here broadly to include those 
who do not complete the course, those who complete the course with in- 
ferior ratings, and those who prove unsatisfactory on the job. 

Much of the demand for operators in the area is in small rural shops 
and the demands on the operator, besides linotype operation, often in- 
clude: the maintenance and repair of the machine; the duties of com- 
positor; proofreading; and even reporting, selling, and editing. It was 
not feasible to obtain criteria on factors other than linotype operation in 
this study. Since it was evident that minimum standards on operation 
would improve selection, even though it neglected other significant 
variables, this investigation knowingly neglected these other factors which 
are of varying degrees of importance in different types of shops. 

Two consecutive groups totalling twenty-nine persons which took the 
one semester course were tested at the beginning of training using the 
following battery: a. Schrammel-Brannon revision of the Army Alpha; 
b. The Kuder Preference Record; c. The Minnesota Vocational Test for
Clerical Workers; d. The MacQuarrie Mechanical Aptitude Test; e. The 
Revised Minnesota Paper Form Board; and f. O’Connor-Tweezer Finger 
Dexterity. 



The criteria studied were terminal grades in the course, lines of type 
set per hour at the end of the course, and errors made. The grade cri- 
terion yielded negligible correlations with test results due to lack of ade- 
quate grade scatter and unreliability of grades. 

Speed and error criteria were combined according to the following 
formula: Score equals lines per hour minus twice the number of errors. 

Errors were weighted in this manner because of the fact that accuracy 
is related to speed not merely in terms of lines culled but in terms of 
spoilage and repeat settings. This criterion was satisfactory in the 
opinion of journeyman operators instructing in the course. 
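
The combined criterion stated above can be written as a one-line function. This is only an illustration of the formula; the function and argument names are chosen here, and the sample figures are hypothetical rather than drawn from any trainee's record.

```python
def criterion_score(lines_per_hour, errors):
    """Combined speed-and-accuracy criterion: lines per hour minus twice the errors."""
    return lines_per_hour - 2 * errors


# Hypothetical trainee who sets 100 lines per hour with 5 errors.
print(criterion_score(100, 5))
```

Under this weighting each error cancels two lines of output, so a fast but careless trainee can score below a slower, cleaner one.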


Table 1
Means, S.D.'s and Product-moment r's of Test Scores with Criterion. N = 29

Test                                r with Criterion    Mean     S.D.
Army Alpha                                .62           168.8    31.7
Kuder
  Mechanical                              .31            79.3    17.1
  Computational                           .02            35.9     9.0
  Scientific                              .05            57.8    11.2
  Persuasive                             -.27            65.4    18.4
  Artistic                                .18            47.0    11.8
  Literary                               -.08            64.2    17.3
  Musical                                 .08            17.9    10.5
  Social Service                         -.41            58.2    11.8
  Clerical                               -.11            61.1     5.7
Minn. Test for Clerical Workers
  Number                                  .54           105.9    17.9
  Name                                    .57           104.1    23.8
Minnesota Paper Form Board                .29            38.7     8.9
MacQuarrie
  Tracing                                 .24            34.7     6.7
  Tapping                                 .48            34.8     9.1
  Dotting                                 .31            19.2     3.0
  Copying                                 .49             5.     12.4
  Location                                .19            27.4     6.9
  Blocks                                  .37            17.6     5.4
  Pursuit                                 .41            26.2     6.1
  Total                                   .59            67.8     7.8




The number on whom complete data were available was twenty-nine. 
This includes all but two persons who took the initial battery. One of 
those who did not complete the course died. The reason for the other
drop is not known. Thus the attenuation of correlations due to drops
resulting from inaptitude is negligible. 




Table 1 shows the zero order correlation coefficients of test scores
with criterion. From Table 1 it will be noted that the most significant
correlations are those for the Army Alpha, the Kuder Mechanical, Kuder
Persuasive, Kuder Social Service, the two sections of the Minnesota Test
for Clerical Workers, the Minnesota Paper Form Board and the
MacQuarrie Total Score. In order to keep the multiple-regression equation
within bounds, it was decided to work with only five of the above scores.
All five tests were retained in the final analysis and the highest
coefficient for each test was selected (where several scores were obtained).
An exception to this was made on the Kuder where the highest positive
coefficient was selected and the higher negative coefficient was bypassed
because it was felt that the negative interest relationship would be
difficult to utilize in dealing with applicants or counselees. Multiple
correlation analysis by the Doolittle method¹ was carried out on the following
scores: a. Army Alpha; b. Kuder, Mechanical; c. Name section of
Minnesota Clerical; d. MacQuarrie, Total Score; and e. Minnesota Paper
Form Board.

O'Connor Finger Dexterity showed a product-moment r of .31 and
might be desirable in such a battery, but was dropped because data were
incomplete.

Table 2
Inter-Correlations of Variables Used to Determine the Multiple Regression Equation

                                        Army     Kuder     Minn. Clerical
                           Criterion    Alpha    (Mech.)   (Name)           M.P.F.B.
Army Alpha                    .62
Kuder (Mech.)                 .31       -.03
Minn. Cler. (Name)            .57        .80     -.28
Minn. Paper Form Board        .29        .35      .17       .32
MacQuarrie (Total)            .59        .36      .11       .37              .07


Table 2 shows the intercorrelations of the five tests, which were the basis of
the multiple coefficient. The multiple correlation analysis gave a
coefficient of +.82 with the criterion. The multiple regression equation and
standard error of estimate are as follows:


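$$X' = b_1 X_1 + .4080\,X_2 + .2511\,X_3 + .3205\,X_4 + .8151\,X_5 - c$$

(The coefficient $b_1$ of $X_1$ and the constant $c$ are not legible in the scanned
source; the constant ends in .78.)

S.E. est. = ± 10.64.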


1 Guilford, J. P. Psychometric methods. New York: McGraw-Hill, 1936, pp. 39%




1 = Army Alpha, Raw Score.
2 = Kuder Mechanical, Raw Score.
3 = Minn. Voc. Test for Clerical Workers (Name Comparison), Raw Score.
4 = Minn. Paper Form Board, Raw Score.
5 = MacQuarrie, Total Raw Score.

Table 1 gives means and standard deviations of raw scores for the 
group. 
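
As a rough check on the multiple coefficient, the standardized regression weights and R can be recomputed from the intercorrelations. The sketch below is not the Doolittle worksheet the authors used; it solves the same normal equations by plain Gaussian elimination. The correlations are the Table 2 values as read from the scan (the signs and decimal points are an assumption where the print is unclear), and all function and variable names are chosen for this illustration.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


# Order: Army Alpha, Kuder (Mech.), Minn. Cler. (Name), Paper Form Board, MacQuarrie.
validities = [0.62, 0.31, 0.57, 0.29, 0.59]
# Lower triangle of the predictor intercorrelations, as read from Table 2.
lower = [
    [],
    [-0.03],
    [0.80, -0.28],
    [0.35, 0.17, 0.32],
    [0.36, 0.11, 0.37, 0.07],
]
n = len(validities)
r_xx = [[1.0] * n for _ in range(n)]
for i, row in enumerate(lower):
    for j, value in enumerate(row):
        r_xx[i][j] = r_xx[j][i] = value

betas = solve(r_xx, validities)  # standardized regression weights
multiple_r = math.sqrt(sum(b * v for b, v in zip(betas, validities)))
print([round(b, 3) for b in betas], round(multiple_r, 3))
```

With these inputs the result should fall close to the reported coefficient of .82. The weights are standardized, so they differ from the raw-score multipliers in the published equation by the ratio of each test's S.D. to that of the criterion.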

Several facts of interest come out in an inspection of the regression
equation. Since the raw scores differ widely in variability, the
multipliers in the equation do not give any indication of the relative
contribution of any given test to the prediction of the criterion. The
mechanical interest factor and the Paper Form Board make substantial
contributions to the equation in spite of relatively low r's. They seem to
measure phases of the total picture not otherwise covered to any great extent.
The Army Alpha, although it shows the highest r of any of the tests,
apparently overlaps with several of the other tests, so that its
contribution is only nominal. The MacQuarrie and the Minnesota Test for
Clerical Workers (names) seem to be the most useful tests if a shorter
battery were desired.


Table 3
Scattergram of Predicted Scores on Obtained Scores (R = .82)

[Rows: actual score in five-point intervals from 40-45 to 115-120. Columns:
predicted score in five-point intervals from 40-45 to 110-115. The cell
entries are not legible in the scanned source.]


Since the number of cases is small (N = 29), some variability is to 
be expected as subsequent data are accumulated. The present equation 
represents a working formula which is statistically reliable but tentative 
with regard to the actual figures obtained. 

Table 3 represents a scattergram of scores predicted by the multiple 
regression equation and the obtained scores. Applicants for admission 
to the course are to be tested with the battery finally selected and
prediction of their standing will be made. On the basis of this plus an
interview blank, selection of students will be made. It can be readily seen
that, given any particular selection ratio, the selection of operators should 
be substantially improved by use of the above diagram. The extent of 
improvement will depend upon the favorableness of the ratio involved. 
Given the correlation value, the number of applicants and the number 
of individuals who will be admitted, the Taylor-Russell tables² can be
applied and the per cent of improvement determined. 


Received Aug. 25, 1947. 
2 Tiffin, J. Industrial psychology. New York: Prentice-Hall, 1946, pp. 363-367.


Per Cent Increase in Output of Selected Personnel 
as an Index of Test Efficiency 


R. F. Jarrett 
University of California 


It recently occurred to the writer that a very useful measure of the
adequacy of a program of testing for employee selection (or classification)
would be the ratio of the mean output of a group selected on the basis of
their high test-scores to the mean output of an unselected¹ group. A
search of the literature revealed that Richardson (8) had called attention
to the desirability of describing the efficiency of a testing program in these
terms. He has suggested a method of predicting the improvement in
output of selected workers given the validity coefficient, the selection
ratio, p (the proportion of applicants selected), and an estimate of a
constant, k, which is the ratio of the mean output of the upper P%² of
“unselected” personnel to the mean output of the lower (100 - P)% of
“unselected” personnel.

He computes the value of E defined by the equation:

$$E = \frac{r(k - 1)q}{pk + q}, \tag{1}$$

where r is the validity coefficient of the test, k is the ratio previously
defined, p is the selection ratio, or the proportion of applicants to be
selected, and q = 1 - p is the proportion rejected.
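
A minimal sketch of the computation, assuming Equation (1) has the form E = r(k - 1)q / (pk + q) with q = 1 - p, which is what a fourfold table with equal marginal splits on test and output yields. Reading E as a proportional gain in mean output is also an assumption here. The values k = 3.5 and p = .25 are those of Richardson's illustrative example discussed in this paper; r = .50 is a hypothetical validity chosen for the illustration, and the function name is invented for it.

```python
def richardson_e(r, k, p):
    """Improvement index E = r(k - 1)q / (pk + q), with q = 1 - p.

    r: validity coefficient of the test
    k: ratio of mean output of the upper p to that of the lower q of
       unselected personnel
    p: selection ratio (proportion of applicants selected)
    """
    q = 1.0 - p
    return r * (k - 1.0) * q / (p * k + q)


# k and p from Richardson's example; r = .50 is hypothetical.
e = richardson_e(r=0.50, k=3.5, p=0.25)
print(round(e, 3))
```

E is zero when the test has no validity (r = 0) or when output does not differ between the two groups (k = 1), which is a quick sanity check on the form of the expression.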


1 The term “unselected” must be interpreted throughout this paper to mean, “the
members of that population of individuals who apply for the job in question and who,
when individuals were needed for the job in question, would have been put to work
without further regard for their qualifications before a testing program was initiated.”
It is necessary thus to qualify the expression because numerous selective factors make
the “unselected” population for the job of janitor in a New York office building, for
example, quite different from the unselected population for the job of hosiery looper in
a mill in Illinois. It is desired here to evaluate the efficiency of the test; thus
“unselected” must be carefully qualified as indicated.

2 Throughout this paper the upper-case P will be used to denote the per cent selected,
while the lower case p denotes the proportion selected. Thus if 85% of applicants are
selected, P = 85, but p = 0.85. The letter “q” similarly will denote the per cent
rejected when upper case, the proportion rejected when lower case.


His solution is based upon the relations between the point correlation
coefficient and the various proportions in the marginal totals of a
four-fold table. A more general solution is available, and it is the purpose of
this paper to call attention to some implications of Richardson's solution 
and to present the general solution together with tables describing test 
efficiency under various circumstances. 

Before proceeding to the general solution, however, it is of interest to 
note some implications of Richardson's solution and the example to which 
he applies it. It should perhaps be made explicit here that the primary 
requirement which must be met in order that any method of the type 
here considered may be used is that the criterion must be one to which 
the coefficient of variation may be legitimately applied; that is, the
criterion must yield scores in equal units measured from an absolute zero.
For it is clear that the ratio k may assume any value between unity and
infinity by the arbitrary location of a relative zero. The method would
thus appear to be without utility in the case of such criteria as, for
example, ratings of superiors. This imposes a restriction upon the use of
methods of the type under discussion, but in considering production costs,
management is concerned chiefly with those criteria to which the method
is applicable.

In his illustrative example, Richardson has assumed the ratio of the 
mean output of the upper twenty-five per cent of workers to the mean 
output of the lower seventy-five per cent to be 3.5. In view of the fact
that in very few reported cases does the ratio of best to poorest individual 
worker reach this value (3, p. 35), and in view of the general importance 
of this ratio to the determination of E, it is of interest to determine (even 
though this should require some assumption about the form of the dis- 
tribution) a general expression for this ratio. This derivation is quite 
simple under the assumption that the distribution of outputs is normal 
or can be described reasonably satisfactorily by a mutilated normal curve. 

Consider first the ratio k for a normal distribution of unit standard
deviation and mean c. In this situation the absolute zero of production
is c units below the mean. So long as c is about 2.6 or greater there will
be little loss of generality in assuming the distribution to be strictly
normal with unit area.

Let us write:

$$\bar{z}_1 = \frac{\bar{X}_1 - c}{\sigma_x} \tag{2}$$

and the corresponding equation for $\bar{z}_2$. In the above expression the
variance of $x$ is assumed unity, and the bars indicate means. Thus $\bar{z}_1$ is
the deviation of the mean of the upper P% of the cases from the general
mean, $\bar{z}_2$ is the deviation of the mean of the lower Q% (100 − P) of the
cases from the general mean, $\bar{X}_1$ and $\bar{X}_2$ are the respective raw-score
values of those means, and c is the mean of the total distribution. We
write for the ratio sought:

$$k = \frac{c + \bar{z}_1}{c + \bar{z}_2} \tag{3}$$

Now, by a well-known relation, if X is normally distributed:

$$\bar{z}_1 = \frac{z}{p}, \qquad \bar{z}_2 = -\frac{z}{q} \tag{4}$$

where z is the height of the dividing ordinate. We then write:

$$k = \frac{c + z/p}{c - z/q} \tag{5}$$

Clearing, we obtain:

$$k = \frac{cp + z}{cq - z} \cdot \frac{q}{p} \tag{6}$$

Richardson has noted the desirability of deriving formulae of the type
he presents without making assumptions as to the form of the distributions
involved, but often it is necessary to make such assumptions in
order to obtain insight into the magnitude of the value of E to be expected;
some such assumption must be made if k is to be estimated for
a general case, though the observed value of k may be employed where
it is available.

It is of interest to observe that were the distribution of worker outputs
essentially normal, the value of k used by Richardson in his illustrative
example could not, as a matter of fact, be approached. Substituting
the appropriate values of p, q and z in (6) and assuming the
mean of the distribution to be three standard deviations above zero
(i.e., c = 3 or v = .33, where v is the coefficient of variation), the value
of k for a division into the upper twenty-five and lower seventy-five per
cent will be found to be only 1.65, as opposed to the value of 3.5 assumed
in the example. For this situation the ratio of best to poorest worker
would be very large indeed (of the order of 11 to 1), and if c be taken
somewhat greater in order to reduce this ratio to a value more nearly in
agreement with observed values of this ratio, the value of k becomes
smaller. Substituting the hypothetical value of k in formula (1) yields a
value of 0.22 for E, a value which, though less impressive than the 0.60 of
Richardson's example, would appear to be more nearly representative of
the effectiveness of the selective procedure.
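
The arithmetic of the preceding paragraph can be checked with a short sketch. It is offered only as an illustration under stated assumptions: the function names `k_normal` and `richardson_e` are hypothetical, the normal ordinate is taken from Python's standard `statistics.NormalDist`, and the validity coefficient r = .52 is an assumption, chosen because it is the value for which formula (1) returns 0.60 when k = 3.5 and p = .25 (the text does not state it).

```python
from statistics import NormalDist


def k_normal(c: float, p: float) -> float:
    """Equation (6): k = (cp + z)/(cq - z) * q/p for a unit-variance normal
    distribution whose mean lies c units above the absolute zero."""
    q = 1.0 - p
    std = NormalDist()
    # z is the height of the ordinate that cuts off the upper proportion p.
    z = std.pdf(std.inv_cdf(q))
    return (c * p + z) * q / ((c * q - z) * p)


def richardson_e(r: float, k: float, p: float) -> float:
    """Equation (1): E = r (k - 1) q / (p k + q)."""
    q = 1.0 - p
    return r * (k - 1.0) * q / (p * k + q)


R_ASSUMED = 0.52  # assumption: not given in the text, inferred as described above

k = k_normal(c=3.0, p=0.25)
print(f"k for the normal case: {k:.3f}")
print(f"E with Richardson's k = 3.5: {richardson_e(R_ASSUMED, 3.5, 0.25):.2f}")
print(f"E with the normal-curve k:   {richardson_e(R_ASSUMED, k, 0.25):.2f}")
```

Under these assumptions the normal-curve k comes out near 1.66, which agrees with the 1.65 quoted above to within rounding.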

Production figures—for understandable reasons—rarely find their way 
into print. A cursory search, however, revealed two sets of such data, 
and the writer has found the ratio of the mean outputs of the upper fourth 
to the lower three-fourths of employees for these two sets. 

For 99 hosiery loopers (11, p. 7) of more than one year's experience 
the upper one-fourth of the workers averaged 21.3 units output per hour,
while the lower three-fourths averaged 16.2 units per hour. The value of
k is thus 1.32, somewhat less than the value for the ideal normal distri- 
bution. 

Such a selected group, however, is not typical of the employment 
situation. Data are presented in the same source (11, p. 6) for 203
loopers of varying experience. This distribution yields a k-ratio of 1.64,
almost exactly the value which would be yielded by the normal curve.
It may be noted that this agreement between theoretical and empirical 
values is obtained despite the fact that the distribution had a secondary 
mode in the lowest interval. 

It would not be unreasonable to suppose that in the employment 
situation the distribution of worker output might be somewhat positively 
skewed with cases tending to pile up toward the low-production end. A
development along lines similar to the one presented above leads quite
straightforwardly to the following equation for the half-normal curve:

$$k = \frac{cp + 2z_p}{cq + .798 - 2z_p} \cdot \frac{q}{p} \tag{7}$$

where $z_p$ is the height of the ordinate dividing the mutilated normal
distribution into an upper part of P% of the total area and a lower part
of Q% [(100 − P)%] of the area; this will be the ordinate dividing the
total normal curve into an upper part containing P/2% and a lower part
containing Q/2%. The value .798 is simply double the modal ordinate.
It is not reasonable to assume the origin of this curve to fall at the mean
of the total distribution, for this would imply that the modal unselected
worker produced just nothing. If c be set equal to 1 (which yields for
the mutilated distribution approximately the same value of v assumed for
the normal case above), however, the selection ratio, p, remaining .25,
k for this distribution becomes 1.74, a value only slightly more favorable
than that obtained on the assumption of normality.
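
A minimal sketch of equation (7) follows, again using the standard library's `statistics.NormalDist`. The names `k_half_normal` and `cv_half_normal` are hypothetical, and the code assumes that c is measured from the absolute zero of output to the point of truncation (the mode of the half-normal curve), which is the reading under which the figures quoted above are reproduced.

```python
from math import pi, sqrt
from statistics import NormalDist


def k_half_normal(c: float, p: float) -> float:
    """Equation (7) for the upper half of a unit normal curve.

    c is taken here as the distance from zero output to the truncation
    point; this interpretation is an assumption of the sketch.
    """
    q = 1.0 - p
    std = NormalDist()
    # Ordinate of the full normal curve that leaves p/2 in the upper tail.
    z_p = std.pdf(std.inv_cdf(1.0 - p / 2.0))
    double_modal = 2.0 * std.pdf(0.0)  # the constant .798 of equation (7)
    return (c * p + 2.0 * z_p) * q / ((c * q + double_modal - 2.0 * z_p) * p)


def cv_half_normal(c: float) -> float:
    """Coefficient of variation of the half-normal curve placed c units above zero."""
    mean = c + sqrt(2.0 / pi)    # half-normal mean measured from zero output
    sd = sqrt(1.0 - 2.0 / pi)    # half-normal standard deviation
    return sd / mean


print(f"k = {k_half_normal(1.0, 0.25):.3f}")
print(f"v = {cv_half_normal(1.0):.3f}")
```

With c = 1 and p = .25 the sketch gives k close to 1.75 and v close to .335, in line with the 1.74 and the v of about .33 reported in the text.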

Also possible is a degree of skewness intermediate between the symmetry
of the full normal curve and the extreme skewness of the half-normal
curve. The ratio k may be estimated in an equally simple manner
for the distribution obtained by truncating the normal distribution so as
to leave the upper three-fourths of the curve. In this instance, k may
be shown to be:

$$k = \frac{3cp + 4z_p}{3cq + 1.272 - 4z_p} \cdot \frac{q}{p} \tag{8}$$

where $z_p$ is the height of the ordinate dividing the mutilated curve into
the portions p and q; it divides the total curve into the portions ¾p and
1 − ¾p.

Assuming the same value for the selection ratio and setting c equal to
1.77, which will maintain v = .33, k is found to be 1.73.
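
The three-fourths case can be sketched in the same way. The function names are hypothetical, and the code assumes that c in equation (8) is the distance from the absolute zero of output to the mean of the untruncated normal curve; under that assumption a value of c near 1.77 gives v = .33 and k = 1.73.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()
KEPT = 0.75                       # proportion of the normal curve retained
T0 = STD.inv_cdf(1.0 - KEPT)      # abscissa of the truncation point
Z0 = STD.pdf(T0)                  # ordinate there; 4 * Z0 is the constant 1.272


def k_three_fourths(c: float, p: float) -> float:
    """Equation (8): k for the upper three-fourths of a unit normal curve."""
    q = 1.0 - p
    # Ordinate of the full curve leaving (3/4)p of its area in the upper tail.
    z_p = STD.pdf(STD.inv_cdf(1.0 - KEPT * p))
    return (3 * c * p + 4 * z_p) * q / ((3 * c * q + 4 * Z0 - 4 * z_p) * p)


def cv_three_fourths(c: float) -> float:
    """Coefficient of variation of the truncated curve, zero output c below the normal mean."""
    mean_shift = Z0 / KEPT                          # truncated mean minus normal mean
    variance = 1.0 + T0 * Z0 / KEPT - mean_shift ** 2
    return sqrt(variance) / (c + mean_shift)


print(f"k = {k_three_fourths(1.77, 0.25):.3f}")
print(f"v = {cv_three_fourths(1.77):.3f}")
```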
In view of the nature of the selective factors at work it would appear
idle to consider the value of k for negatively skewed distributions.
It appears clear that the coefficient of variation is of primary importance
in the determination of k. It will be noted that for all three of the
types of distributions considered, k is of the same order of magnitude
so long as the coefficient of variation is maintained constant. This
suggests that the ratio k may be relatively insensitive to the form of the
distribution, a finding of particular interest in view of the conditions
underlying the more general derivation to which we now turn.
At the turn of the century Karl Pearson (5) was able to derive
expressions for the means, standard deviations, and intercorrelations
obtaining among several organs when the individuals possessing them had
been selected with respect to one or more other organs. At that time he
found it necessary to assume all distributions, selected as well as
unselected, to be normal or Gaussian in form, although in the light of the
recent demonstration of Yule that the various relationships involving
the correlation coefficient could be derived without recourse to this
assumption, he expressed the opinion that his findings might later be found
not to depend upon that assumption. This opinion he himself verified
some years later (6), observing that “the method is absolutely independent
of Gaussian theory . . . but it does assume that linearity applies
within the degree of useful approximation.” Much later, during World
War II, Cyril Burt (2) in England and workers in the Aviation Psychology
Program of the U. S. Army Air Forces (10) had occasion again to concern
themselves with the effects of selection on intercorrelations, this time
the correlations between psychological tests and criteria of job
proficiency. Referring to Pearson’s early work, Burt (2) derived expressions
for the influence of selection without recourse to the assumption of
normality, apparently in ignorance that Pearson himself had done so earlier.
It is of interest to note in passing that although men working with the
programs of both the British and American armies found it necessary
to employ the results of this early study of Pearson’s in connection
with the correction of validity coefficients for variability of the groups
taking the tests, they appear not to have recognized that the results there
provided also form the basis of a very useful criterion of test efficiency.

It would be redundant here to repeat either the general derivation 
or that for the special case of interest. It will suffice to observe that if 
the symbols be defined as noted below, the difference between the mean 
criterion score of those individuals selected by the "aptitude test" and 
the general mean of the unselected group on the criterion is given by: 


$$\bar{y}_s - \bar{y} = r_{xy}\,\frac{\sigma_y}{\sigma_x}\,(\bar{x}_s - \bar{x}) \tag{9}$$

where
$\bar{x}$ = mean of scores on selective device for all candidates (unselected)
$\bar{x}_s$ = mean of scores on selective device of those candidates selected
$\bar{y}$ = mean of criterion scores for all candidates (unselected)
$\bar{y}_s$ = mean criterion score of those candidates selected
$\sigma_x$ = standard deviation on selective device of all candidates
(unselected)
$\sigma_y$ = standard deviation of criterion scores of all candidates
(unselected)
$r_{xy}$ = product-moment coefficient of correlation between selective
device and criterion.


It is desired to express the gain through the use of the test as a
proportion of the mean performance of an unselected group. We thus write:

$$E' = \frac{\bar{y}_s - \bar{y}}{\bar{y}} = r_{xy}\,\frac{\sigma_y}{\bar{y}} \left( \frac{\bar{x}_s - \bar{x}}{\sigma_x} \right) \tag{10}$$

Writing $v_y$ for the ratio of the standard deviation of y to the mean of y
(the coefficient of variation), (10) becomes

$$E' = r_{xy}\, v_y \left( \frac{\bar{x}_s - \bar{x}}{\sigma_x} \right) \tag{11}$$

The simplicity of (11) is gratifying. This equation expresses the fact
that 100 men selected on the basis of their high scores on a test will
produce 100E'% more than 100 men selected without reference to their test
score. It will be noticed that the relative increase in mean criterion score
is directly proportional to three familiar quantities: the validity
coefficient, the coefficient of variation of the criterion, and the relative
deviation of the mean test score of accepted applicants from the general test
mean. It should be emphasized that all these data are available
immediately after the first validation study of the selective instrument.
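
Because (11) makes no assumption about the form of the test-score distribution, its last term can be estimated directly from observed scores. The sketch below shows one way this might be done; the function name `relative_gain`, the rule of selecting the top-scoring proportion p, and the sample scores are all hypothetical and serve only as an illustration.

```python
from statistics import mean, pstdev


def relative_gain(test_scores, r_xy, v_y, p):
    """Equation (11): E' = r_xy * v_y * (mean of selected x - mean of all x) / sigma_x.

    The top proportion p of applicants on the test is treated as selected;
    sigma_x is the standard deviation of all (unselected) test scores.
    """
    n_selected = max(1, round(p * len(test_scores)))
    selected = sorted(test_scores, reverse=True)[:n_selected]
    deviation = (mean(selected) - mean(test_scores)) / pstdev(test_scores)
    return r_xy * v_y * deviation


# Hypothetical test scores for ten applicants, of whom the best two are taken.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
gain = relative_gain(scores, r_xy=0.3, v_y=0.37, p=0.2)
print(f"estimated relative gain in mean output: {100 * gain:.1f}%")
```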

³ Note that assuming the x-distribution to be a dichotomized normal distribution,
substituting the value of $(\bar{x}_s - \bar{x})/\sigma_x$ obtained from normal curve theory, and solving
for r in this equation yields a formula for the biserial correlation coefficient.


The appearance of the coefficient of variation of unselected criterion
scores deserves comment at this point. It will be noted that Richardson’s
k was found to be dependent upon the value of v and apparently
insensitive to distribution form. The coefficient of variation is the variable
(magnitude) analogue of the unselected-success-ratio of Taylor and
Russell.⁴ The fact that $v_y$ enters (11) as it does implies that for those
situations in which among unselected personnel the good producers are
relatively little more productive than the poor ones, no testing program
can be expected to increase the efficiency of operation unless it is possible
to take advantage of extremely rigorous selection.

It is to be noted that no assumptions have been made as to the nature
of any of the distributions involved; none are necessary if the psychologist
is well acquainted with the nature of the distribution of test scores (x)
in the population to which it will be administered as a selective device.
This familiarity will permit him to estimate the last term of (11) with
satisfactory accuracy for any given selection ratio. Inasmuch, however,
as many psychological tests yield distributions (in the situations to which
they are applied) which approximate to the normal distribution, it will
not be amiss to employ this assumption for the purposes of illustrating
this criterion and suggesting the magnitude of the relative increase in
output to be expected from testing programs.
If $r_{xy}$ and $v_y$ are known or assumed to have certain values, the
evaluation of E' from (11) requires only a determination of $(\bar{x}_s - \bar{x})/\sigma_x$. This
value is obtained, on the assumption of normality, from the first equation
of (4) where z is the height of the ordinate cutting off the upper P%
(selection ratio) of the curve. Formula (4) has been evaluated and
tabled for all values of P (7). These tables thus provide the information
necessary to permit the estimation of the effects of a selection program.
Substituting (4) in (11), we have

$$E' = r_{xy} \left( \frac{v_y\, z}{p} \right) \tag{12}$$


⁴ In 1939 Taylor and Russell (9) called attention to the fact that although the low
validity coefficients often obtained in industrial testing situations would probably never
permit reliable prediction of individual performance on the criterion, it is possible to
determine with some accuracy the proportion of a selected group who would be “successful”
on a criterion. They called attention to the fact that even for low validity
coefficients, if only a small proportion of candidates were selected, the proportion of those
selected who would be “successful” could become very high indeed. They introduced
the term “selection ratio” and emphasized the necessity of taking this proportion into
account in any interpretation of the effectiveness of a selective program. Their method
requires definition of “successful” criterion performance in terms of what the present
writer calls the “unselected-success-ratio,” or the proportion of unselected individuals
whose criterion performance exceeds a critical value below which workers will be
classified as unsuccessful and above which they are classified as successful.


From (12) it appears that, for a given value of $v_y$ and p, the relative
increase in the effectiveness of selected workers over that of unselected
workers is directly proportional to the correlation coefficient. Thus the
validity coefficient itself, rather than the coefficient of alienation, the
index of forecasting efficiency, the coefficient of determination, or other
functions of the correlation coefficient, may be considered adequately
to reflect the benefits to be expected of a testing program.⁵ It is
unfortunate (in the sense of introducing complication), however, that the
closeness of the relation between test and criterion is not the only factor
influencing the efficiency of the testing program, so that the validity
coefficient does not tell the whole story. It is of interest to note that E' is
also directly proportional to $v_y$ and that the ratio z/p is positively
accelerated with increasing rigorousness of selection, so that changing p from
25% to 20% improves E' more than the change from 50% to 45%.
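
The positive acceleration of z/p can be illustrated numerically. The following is a sketch under the normality assumption of (12); the helper name `z_over_p` is hypothetical and the normal ordinate comes from the standard library's `statistics.NormalDist`.

```python
from statistics import NormalDist


def z_over_p(p: float) -> float:
    """Height of the normal ordinate cutting off the upper proportion p, divided by p."""
    std = NormalDist()
    return std.pdf(std.inv_cdf(1.0 - p)) / p


# Improvement in z/p from tightening selection by five points at two levels.
gain_rigorous = z_over_p(0.20) - z_over_p(0.25)
gain_lenient = z_over_p(0.45) - z_over_p(0.50)

print(f"25% to 20%: z/p rises by {gain_rigorous:.3f}")
print(f"50% to 45%: z/p rises by {gain_lenient:.3f}")
```

Since E' is proportional to z/p for fixed validity and coefficient of variation, the larger rise at the more rigorous level carries over directly to E'.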

Equation (11) would seem to be preferable to (1) due to the greater
generality of (11) resulting from the substitution for the ratio k of the
coefficient of variation, v, which assumes but one value for all values of
the selection ratio. In the case of k, a separate value must be computed
for each value of the selection ratio, p, while (11) requires only the value
of v for the determination of the values of E' for a series of values of p.
Both equations should be expected to yield essentially similar values of
E for a given situation, but (11) would appear to be the more direct
approach. The demonstration of the insensitiveness of k to distribution
form suggests, moreover, that considerable confidence may be placed in
the simpler equation (12), even though the distribution of test scores
is not strictly normal.


The use of (12) would, however, be much simplified by the availability
of tables of the value $v_y z/p$ which appears in parentheses in (12). For this
purpose and to give numerical illustration of the improvement to be
expected under various circumstances, the accompanying table has been
prepared showing for several values of the selection ratio and several
values of the coefficient of variation, the value of one hundred times this
function. Multiplication of the tabled values by the validity coefficient
yields directly the per cent increase in mean productivity of selected
versus unselected workers. As an illustration of the use of the table,
consider the data on hosiery loopers presented earlier. For the 203
unselected loopers the coefficient of variation was 0.37. Assuming a
selection ratio of 0.25 and a validity coefficient of .3, interpolation in the
table reveals that the selected personnel will produce about 0.3 × 47%, or
about 14% more than unselected personnel. In a mass-production system,
a 14% increase in production would appear well worth striving for,
and it does not appear unreasonable to expect that the cost of initiating
and maintaining the testing program will be more than paid for by such
an increase. Formula (1) yields a value of 12.4% from these data, a
value in good agreement, as is to be expected.

⁵ This point has been pressed, on quite different grounds, by Johnson (4) and by
Brogden (1).
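
The hosiery-looper illustration can be reproduced in a few lines. This is a sketch only: the function names are hypothetical, (12) is evaluated under the normality assumption, and formula (1) is taken in the form E = r(k − 1)q/(pk + q), which returns the 12.4% quoted above for these data.

```python
from statistics import NormalDist


def gain_eq12(r: float, v: float, p: float) -> float:
    """Equation (12): E' = r * v * z / p, with z the normal ordinate at the cut."""
    std = NormalDist()
    z = std.pdf(std.inv_cdf(1.0 - p))
    return r * v * z / p


def gain_eq1(r: float, k: float, p: float) -> float:
    """Formula (1): E = r (k - 1) q / (p k + q)."""
    q = 1.0 - p
    return r * (k - 1.0) * q / (p * k + q)


# Values given in the text for the 203 loopers of varying experience.
r, v, k, p = 0.3, 0.37, 1.64, 0.25
print(f"equation (12): {100 * gain_eq12(r, v, p):.1f}% increase")
print(f"formula (1):   {100 * gain_eq1(r, k, p):.1f}% increase")
```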

There remains one matter to be treated before the formulas are left
to the use of management’s test-consultants. This is the matter of the
reliability of the predicted efficiency. Unfortunately this is not a simple
matter, and the writer is not at present prepared to present the definitive
answer to the problem. About the solution, however, some things seem
clear. If the population of unselected criterion scores were perfectly
known and the correlation between test and criterion known for the
population, it is apparent that any particular random sample of the unselected
population of workers would yield for a given selection ratio (substituting
the parametric value of the correlation coefficient and the coefficient of
variation in (11)) a random sample from the population of selected
criterion scores. The standard error of the means of such random samples
would, of course, be the ratio of the standard deviation of the population
of selected criterion scores to the square root of the number of individuals
selected. An approximation to the standard error of the estimated mean
may thus be obtained from Pearson’s (5) expression for the standard
deviation of the selected criterion scores. If Y be considered the criterion,
X the test, $\sigma_x$ and $\sigma_y$ the standard deviations of the unselected
population, $\Sigma_y$ the standard deviation of the population of Y’s remaining after
direct selection of X’s, and $s_x$ the standard deviation of the X’s of the selected
group, the standard deviation sought is given by the formula:

$$\Sigma_y = \sigma_y \left[ 1 - r^2 \left( 1 - \frac{s_x^2}{\sigma_x^2} \right) \right]^{1/2} \tag{13}$$

Now if the distribution of test scores be assumed normal to a first
approximation, the ratio $s_x^2/\sigma_x^2$ may be determined from the incomplete normal
moment functions for any value of the selection ratio. The writer has
determined this ratio for several values of p and for several values of r.
This arithmetical work reveals that for the range of correlation
coefficients likely to be met with in practice (.1 to .5) the standard deviation
of the selected Y’s cannot be expected to be much less than .9 the standard
deviation of the unselected Y’s for even the most rigorous selection. It
is therefore suggested (in view of the approximate nature of this standard
error, due to the failure to allow for the sampling errors in r or in the
various standard deviations involved) that the standard error of the
predicted mean output be taken as approximately equal to $\sigma_y/\sqrt{Np}$ where N
is the total number of cases in the unselected group of applicants and p
the proportion selected.
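
A brief sketch of this arithmetic follows, assuming normally distributed test scores and selection of the upper proportion p. The function names are hypothetical; the variance of the selected test scores is obtained from the standard result for the upper tail of a normal curve, which may differ in form from the tables of incomplete normal moment functions the writer consulted.

```python
from math import sqrt
from statistics import NormalDist


def selected_sd_ratio(r: float, p: float) -> float:
    """Ratio of the selected to the unselected criterion standard deviation,
    following Pearson's expression (13) with a normal test-score distribution."""
    std = NormalDist()
    t = std.inv_cdf(1.0 - p)        # cutting score in standard units
    m = std.pdf(t) / p              # mean of the selected test scores (z/p)
    var_ratio = 1.0 + t * m - m * m  # s_x^2 / sigma_x^2 for the upper tail
    return sqrt(1.0 - r * r * (1.0 - var_ratio))


def se_predicted_mean(sigma_y: float, n: int, p: float) -> float:
    """Suggested approximation: sigma_y / sqrt(N p)."""
    return sigma_y / sqrt(n * p)


for r in (0.1, 0.3, 0.5):
    print(f"r = {r:.1f}, p = .05: ratio = {selected_sd_ratio(r, 0.05):.3f}")

# Hypothetical figures: sigma_y = 4 units of output, 400 applicants, p = .25.
print(f"standard error: {se_predicted_mean(4.0, 400, 0.25):.2f}")
```

Even with r = .5 and only the top 5% selected, the ratio stays near .89, which is consistent with the statement that it cannot be expected to fall much below .9.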


Summary and Conclusions 


1. Attention is called to the desirability of being able to describe the 
efficiency of a selective testing program in terms of the ratio of the mean 
production of a group selected by the program to the mean production 
of an unselected group. Attention is also called to Richardson's ap- 
proach to the problem in terms of the relations between the point correla- 
tion and the four marginal totals of the four-fold table. 

2. A literal expression is derived (under the assumption of normality
of distribution) for the ratio (Richardson's k) of the mean of the upper
P% of a group to the mean of the lower Q% [(100 − P)%] of the group.

3. Expressions for this ratio are derived for the skewed distributions
yielded by the upper half and the upper three-fourths of a normal
distribution, and k is shown to be of the same order of magnitude for all
three of these forms so long as the coefficient of variation is held constant.
It is shown to be sensitive to changes in v.


Table 1

The Value of the Quantity $100\,v\,z/p$ for Various Values of p (the Selection Ratio)
and v (the Coefficient of Variation) *

                              Selection Ratio, p
  v      5%     10%    20%    30%    40%    50%    60%    70%    80%    90%    95%
 .01    2.06   1.76   1.40   1.16   0.97   0.80   0.64   0.50   0.35   0.20   0.11
 .05   10.31   8.77   6.99   5.80   4.83   3.99   3.22   2.48   1.75   0.98   0.54
 .10   20.63  17.55  14.00  11.59   9.66   7.98   6.44   4.97   3.50   1.95   1.09
 .20   41.25  35.10  28.00  23.18  19.32  15.96  12.88   9.93   7.00   3.90   2.17
 .30   61.88  52.65  41.99  34.77  28.98  23.94  19.32  14.90  10.50   5.85   3.26
 .40   82.51  70.20  55.99  46.36  38.63  31.92  25.76  19.87  13.99   7.80   4.32

* Here v is taken simply as the ratio of standard deviation to mean. Note that since
the quantity tabled is a linear function of v, the table may be extended to any value of v
by multiplying the value for v = .10 by the appropriate quantity.
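
For values of v or p not tabled, the quantity can be computed directly rather than interpolated. The following is a minimal sketch under the normality assumption on which the table rests; the name `tabled_value` is hypothetical, and small differences from the printed entries in the second decimal place are to be expected from rounding.

```python
from statistics import NormalDist

SELECTION_RATIOS = [0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 0.95]
CVS = [0.01, 0.05, 0.10, 0.20, 0.30, 0.40]


def tabled_value(v: float, p: float) -> float:
    """The quantity 100 * v * z / p, z being the normal ordinate cutting off the upper p."""
    std = NormalDist()
    z = std.pdf(std.inv_cdf(1.0 - p))
    return 100.0 * v * z / p


# Print the grid in the layout of Table 1.
print("   v " + "".join(f"{100 * p:>7.0f}%" for p in SELECTION_RATIOS))
for v in CVS:
    print(f"{v:5.2f}" + "".join(f"{tabled_value(v, p):8.2f}" for p in SELECTION_RATIOS))

# The hosiery-looper case: v = .37, p = .25, to be multiplied by the validity.
print(f"{tabled_value(0.37, 0.25):.1f}")
```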


4. A more general solution to the problem of the ratio of the mean
output of a group of selected workers to the mean output of a group of
unselected workers is presented from Pearson’s general solution to the
problem of the effects of selection.

5. The efficiency of a testing program so defined was shown in this
derivation to be directly proportional to: (a) the validity coefficient; (b)
the coefficient of variation among the unselected criterion scores; and (c)
a function of the rigorousness of selection, or the selection ratio.

6. The formula presented as the general solution (11) is dependent
solely upon the assumption that regression is linear to a useful degree of
approximation.

7. A table is presented from which, knowing the validity coefficient, 
the selection ratio, and the coefficient of variation of the unselected cri- 
terion scores, the per cent improvement resulting from selection may be 
estimated on the assumption of normality, and evidence is presented 
which suggests that failures of this assumption such as might be met 
with in practice will affect the tabled values only slightly. 

8. Some suggestions are made as to a reasonable value for the standard 
error of the estimated mean of selected criterion scores. 


Received July 9, 1947. 


References 


1. Brogden, H. E. On the interpretation of the correlation coefficient as a measure of
predictive efficiency. J. educ. Psychol., 1946, 37, 65-76.

2. Burt, Cyril. Statistical problems in the evaluation of army tests. Psychometrika,
1944, 9, 219-236.

3. Hull, C. L. Aptitude testing. New York: World Book Company, 1928.

4. Johnson, H. M. General rules for predicting the selectivity of a test. Amer. J.
Psychol., 1942, 55, 436-442.

5. Pearson, K. Mathematical contributions to the theory of evolution. XI. On the
influence of natural selection on the variability and correlation of organs. Phil.
Trans. Roy. Soc. London, 1903, 200, 1-66.

6. Pearson, K. On the general theory of the influence of selection on correlation and
variation. Biometrika, 1912, 8, 434-443.

7. Pearson, K. (ed.). Tables for statisticians and biometricians. Part II. Cambridge
University Press, 1932.

8. Richardson, M. W. The interpretation of a test validity coefficient in terms of
increased efficiency of a selected group of personnel. Psychometrika, 1944, 9,
245-248.

9. Taylor, H. C., and Russell, J. T. The relationship of validity coefficients to the
practical effectiveness of tests in selection: Tables and discussion. J. appl.
Psychol., 1939, 23, 565-578.

10. Thorndike, R. L. Research problems and techniques. Report No. 3, Army Air
Forces Aviation Psychology Program Research Reports, 1947.

11. Tiffin, J. Industrial psychology. New York: Prentice-Hall, 1942.




Inter-Relationships of Selected Personnel Functions *


Elmer R. John 
University of Minnesota 


The author was pressed by two circumstances to construct the ac- 
companying chart, and it is presumed that many personnel directors as 
well as faculty members who teach personnel administration face these 
same situations. The first was the problem of presenting a discussion 
of personnel functions to business management in a way that would make 
it clear as to why so many procedures are used in present-day personnel 
programs. Personnel directors often submit to management a list of 
procedures that they would like to put into effect, but they use lengthy 
and confusing verbal explanations to justify the items in the list. Fre- 
quently the specialized terminology is new to top management, and the 
verbose explanations that accompany them all too often stimulate re- 
sistance to their acceptance. A clear outline of how personnel practices 
are related to each other and to the over-all objectives of the organization 
is what appears to be needed. Management says, ‘‘You’re the experts. 
Show us clearly what should be done and why.” This chart was made 
primarily as partial answer to this question, but it might also be useful 
to personnel managers who are establishing new programs and who want 
an outline of the connections between various phases of a functioning 
personnel department. 

The second need for this chart was encountered in the teaching of a
university course in personnel psychology. One important objective is
to teach principles, methods, and facts related to job analysis, job
evaluation, selection and placement, merit rating, etc. It is quite another task
to convey to students a clear understanding of the inter-relationships of
all these procedures. In this connection, it was found that, when a
selected list of personnel functions is given to a class and the students are
asked to produce their own charts of the inter-relationships, the project
stimulates student participation in the discussion and leads to better
understanding of the topic.

The accompanying chart (Figure 1) is a modification of the original
that was used in conference with management. New and more detailed
relationships are suggested by each person who looks at it critically, but
on certain other relationships there is consistent agreement. Since clarity
is the underlying purpose of making the chart at all, the number of lines
on it has been kept to a minimum. Many more lines of indirect and
tenuous relations could still be shown. Pigors and Myers, for example,
say that job analysis is related to “selection and placement, training,
transfer, upgrading, and promotion, and in making wage surveys, . . .
safety program, and as a partial basis for time studies in connection with
wage-incentive plans” (1, p. 220-221). This summary statement is
perfectly true. Likewise, numerous similar possibilities of fine
inter-relationships exist between other functions than just job analysis; but to draw
lines for all of these direct and indirect connections would complicate the
diagram to the point of uselessness. Consequently, only the more
obvious and direct relationships are represented by connecting lines.

* Midland Cooperative Wholesale, Minneapolis, Minnesota, for whom the writer
prepared this chart, encouraged and sponsored the early publication of this paper.
The author acknowledges indebtedness to Donald G. Paterson, Editor, and to Philip H.
Kriedt, University of Minnesota, for helpful suggestions and criticisms.

[Fig. 1. Chart showing inter-relationships of selected personnel functions. (Dotted lines show less direct relationships.)]

Upon first inspection, it may appear that many of the relationships 
mentioned by Pigors and Myers above are not indicated here. This 
chart, however, shows that job analysis is related to promotion and trans- 
fer; but the connection is shown as going by way of job evaluation rather 
than directly. In the same way, job analysis is related to selection 
through job specifications. By this method, many relationships can be 
traced which might otherwise seem to the reader to have been omitted
from the diagram.

The selected functions have been taken from Yoder (5) and Pigors
and Myers (1), but they can be found, at least in part, in almost any
textbook on personnel administration.¹ The diagram is not designed to give
new facts, but merely to show, graphically, the relationships that do exist
and which have been described by authors in this field. To take just a
few instances: Line No. 3 of the diagram (connecting Selection and
Placement with Training Program) is explained by Yoder, “A second
principle (in establishing any effective training program) is involved in the
selection of those who are to be trained” (5, p. 242). Lines No. 2 and
No. 6 are spoken of by Shartle when, in discussing job specifications, he
says, “Here the items from the job analysis report have been edited for
use in the company employment office” (3, p. 46). Line No. 7 is included
so as to show that, “Determination of training needs . . . can be
effectively accomplished only in terms of job analysis and records maintained
by the personnel division” (5, p. 219). Pigors and Myers verbalize Lines
No. 10 and No. 24 as follows: “Before attempting to evaluate jobs,
however, it is necessary to know what a worker does on each specific job. . . .
This is job description and analysis . . .” (1, p. 220); “With a
well-administered employee-rating plan, management is in a better position to
develop a sound promotion policy” (1, p. 174). These quotations are
but a few examples from the literature which imply or express the
inter-relationships pictured by the lines in Figure 1.

¹ Readers familiar with elaborate charts of personnel functions, such as Figures 8
and 34 in Scott, Clothier et al. (2, pp. 32 and 138), may see a decided similarity. Their
diagram, however, is tied in more with the traditional organization type of chart.

The solid lines indicate that one function is basic to or useful in the 
administration of another. The arrows point to the procedures that are 
relatively dependent on others. When a reciprocal connection exists, the 
arrows point in both directions. Dotted lines are used to show the more 
tenuous relationships. In cases where the over-all conditions in the or- 
ganization are affected by a certain procedure, or where a procedure could 
be related directly to most of the others (as in a Suggestion System), 
this broad relationship is summarized by an arrow to the arc at the right
which symbolizes production, efficiency, and general morale. 

Detailed explanation of each line of relationship is not included here
because they are so much more fully described in the pages of any ade- 
quate textbook on personnel administration. 


Received January, 1948. 
Early publication. 


References 


1. Pigors, P., and Myers, C. A. Personnel administration. New York: McGraw-Hill,
1947.

2. Scott, W. D., Clothier, R. C., Mathewson, S. B., and Spriegel, W. R. Personnel
management (3rd ed.). New York: McGraw-Hill, 1941.

3. Shartle, C. L. Occupational information. New York: Prentice-Hall, 1946.

4. Tiffin, J. Industrial psychology. New York: Prentice-Hall, 1947.

5. Yoder, D. Personnel management and industrial relations. New York:
Prentice-Hall, 1942.


The Effect of Smoking on Tremor


A. S. Edwards
University of Georgia


Many studies on the effects of smoking have been made but without 
as accurate measurement as is desired—at least in some of the experi- 
ments. Probably the best summary of results of smoking is that of 
Hull.¹ As he says, the measurement of the effects of smoking on tremor
has been unsatisfactory. It has continued to be unsatisfactory; and the 
following experiments have been made to give accurate results in a field 
of growing importance since smoking has become so common. Prelimi- 
nary trials showed that with the finger tromometer, the effects on finger 
tremor could be detected and measured immediately after the smoking of 
half a cigarette. 

Does smoking increase finger tremor? Is it true, as some students 
argue, that they should not be required to go through two or three hour 
examinations without being permitted to smoke? The following report 
gives at least a partial answer to these and other questions about smoking. 
It also gives some information on a matter generally not understood,
namely, effects of smoking with inhaling as contrasted with smoking and 
not inhaling. The experiments include finger tremor and smoking
one-half a cigarette; taking eight puffs on a cigarette in one minute; smoking
with and without inhaling; effect of elimination of smoking for two hours; 
smoking “denicotinized” cigarettes; smoking corn silk with and without 
inhaling; and breathing (but not smoking) in a smoke filled room. 

In contrast to the significant results to be reported below in connection 
with finger tremor, it may be mentioned that comparatively slight results 
have been found in our experiments upon the effect of smoking on body 
sway. Results of three experiments have already been reported² and
another unpublished experiment has recently been done with 100 Ss, col- 
lege students between the ages of 18-30, including 50 smokers and 50 
non-smokers. The results corroborated our earlier findings. Only in 
exceptional cases were there increases in body sway; and statistical study 
of the cases showed no differences before and after smoking with an
acceptable critical ratio. This does not mean that there was no effect of
smoking on body sway; but, save for the exceptional cases, the effects
were so small that they stand in sharp contrast with the immediate and
significant results of smoking on finger tremor which are reported below.

¹ Hull, C. L. The influence of tobacco smoking on mental and motor efficiency.
Psychol. Monogr., 1924, 33, No. 3, Whole No. 150, 1-161.
² Edwards, A. S. The measurement of static ataxia. Amer. J. Psychol., 1942, 55, 181.


Experiments with the Tromometer


Apparatus. The author's finger tromometer* was used throughout the
series of experiments. This apparatus permits a tridimensional
measurement of finger movement, front-back, right-left, and up-down.
The sum of the three measurements in millimeters was used for statistical
purposes.

Procedure. Ss were asked to come into the examining room and make
themselves as comfortable as possible. They were asked to sit in the
chair which was placed in such a position that the extended arm pointed
directly towards one unit of the apparatus and was at right angles to the
second unit, the front of the third unit being directly above the finger
loop. S was told that the object of the experiment was to measure the
amount of finger movement. The middle finger of the right hand was
placed in the loop which was drawn fairly tight at the base of the finger
nail. The standard position for the arm was extended but with elbow
just slightly bent so as to relieve undue strain. Ss were required to rest
three to five minutes at least before measurements began in order to
eliminate tremor that might have been caused by activities before coming
to the experimental room. During the measurements they were not
permitted to talk.

Instructions. S was seated and told to lean back and be comfortable
with both feet on the floor, and not to talk. The standard instruction
was given: "Try to be as still as possible."

Subjects. In all experiments, except where otherwise noted, there
were 100 college students, selected at random, aged 16 to 35.


Experiment 1. Effect of Smoking One-half a Cigarette


Subjects. In this experiment there were 50 smokers and 50 non-smokers,
aged 16-35. There were 32 men and 68 women.

Each S was measured first as a control, then permitted to smoke
one-half a cigarette which was marked at the middle with a pencil mark.
Then S was immediately measured for results of smoking on finger tremor.

Results. Immediately a difference was found in finger tremor between
the control and experimental measurements. For non-smokers finger
tremor rose after smoking from 31.2 mm. to 36.8 mm. on the average, an
increase of 18 per cent, but with a C. R. of only 0.5. For the smokers the
tremor increased from an average of 48 mm. to an average of 67
mm., an increase of 39 per cent, with a C. R. of 2.7. This is twice as
much increase for smokers as for non-smokers. Similar results were
found when comparing men with women, and the increase for women was
considerably more than the increase for men. As has been found before,
men had more finger tremor than women with one exception, namely,
among the smokers the women had slightly more finger tremor after
smoking.

* Edwards, A. S. The finger tromometer. Amer. J. Psychol., 1946, 59, 273-283.
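
The percentage increases quoted for Experiment 1 follow directly from the reported means. As a minimal sketch (the function name percent_increase is only an illustrative choice, not anything from the original report), the arithmetic can be reproduced as follows:

```python
def percent_increase(before_mm, after_mm):
    """Percentage change from the control mean to the post-smoking mean."""
    return (after_mm - before_mm) / before_mm * 100.0


# Mean finger tremor (mm) reported for Experiment 1.
non_smokers = percent_increase(31.2, 36.8)
smokers = percent_increase(48.0, 67.0)

print(f"non-smokers: {non_smokers:.1f} per cent")
print(f"smokers:     {smokers:.1f} per cent")
```

The computed values agree with the reported 18 and 39 per cent to within rounding.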


Experiment 2. Eight Puffs on a Cigarette 


The above experiment controlled the amount of a cigarette which was 
consumed. This experiment determined the number of puffs in one 
minute. 

Subjects. There were 108 Ss, college students, aged 17-25, 28
non-smokers and 80 smokers. Of the non-smokers 14 were men and 14
women; of the habitual smokers 40 were men and 40 were women. All 
were chosen at random, except for the effort to get both non-smokers and 
habitual smokers, and an equal number of men and women. 

Results. Again the non-smokers showed less difference, the control 
series before smoking giving a mean of 26.7 and the mean after smoking 
44.4 mm., the C. R. being only 1.96. There was large variability
indicated both by the SD and the fact that there was much less difference
between medians than between means. 

With the smokers, however, all of whom probably inhaled, the con- 
trol series gave an average of 35.5 mm., the experimental series a mean 
of 61.5 mm., an increase of 84 per cent, and there was a C. R. between
the means of 5.8. The medians also showed a similar difference, 24.0 
before and 51.5 after smoking. For the habitual smokers, the 40 men’s 
averages showed more tremor than those for the 40 women both before 
and after smoking, although the increase after smoking was for the men 
78 per cent, and for the women 82 per cent. All differences were statis- 
tically significant. For 11 Ss non-smokers who inhaled there was an in- 
crease in tremor after smoking and inhaling from 32 to 73.6 mm., an in- 
crease of 129 per cent. For 7 of these who had the greatest increase the 
average before smoking was 28.9 mm., and after smoking 107.6 mm, 
an increase in tremor of 272.3 per cent. This may be contrasted with the 
non-smokers who did not inhale who showed an increase of only 9.9 per 
cent after smoking. 

A problem was raised as to why smokers were affected more by 
smoking than were non-smokers. It was possible that increased finger 
tremor among the smokers was due to a cumulative effect; it was also 
possible that the differences were due primarily to inhaling. All habitual 
smokers, or nearly all, inhale. Non-smokers generally do not inhale and
when they attempt to for the first time during experimental series, they
are likely to have a very disagreeable experience. 

These results plainly called for an experiment definitely set up for the 
study of smoking with and without inhaling. 


Experiment 3. Inhaling vs. Not Inhaling 


The following experiment was made for the purpose of comparison of 
the effect of smoking with and without inhaling. In this series there 
were 22 Ss who smoked and inhaled and 8 controls who smoked, but did 
not inhale. 

Subjects. Twenty-two college students, chosen at random, who
inhaled and who were willing to be tested during a 2½ hour period; 8
controls, college students who did not inhale and who were willing to be tested
through a 2½ hour period. All were selected at random with the
exception of these two conditions. Their ages were 17-25. The Ss were men
with the exception of two women in the experimental and three women
in the control group.

Results. The results of this experiment are reported for four of the 
periods of testing, averages, before smoking, after smoking, after no smok- 
ing for two hours, and after smoking again following the two hour period 
of no smoking. The averages for the 8 controls who did no inhaling were
as follows for these 4 periods: 30.8, 26.9, 27.8, 27.2. For those who
inhaled, 22 Ss, the averages were 28.1, 51.1, 32.2, 53.9. Here is a very
definite and significant difference. For those who inhaled there was a
great increase in finger tremor; for those who did not inhale, small and 
statistically insignificant differences appeared before and after smoking, 
and again after withdrawal before and after smoking. 

Individual Cases. With controls not inhaling there were insignificant 
changes. In sharp contrast to this the Ss studied through 10 to 60 min- 
utes after smoking and inhaling, measured at from 3 to 10 minute inter- 
vals, showed the following results for individual cases: Before smoking 
26; after smoking and inhaling 36, 46, 43, 61, 74.7. The S having the
smallest finger tremor showed only 5 mm. before smoking, but after
smoking half a cigarette went up to 49.0 mm. and reported that the doctor
had told her she must stop smoking. Another S before smoking had a
finger tremor of 35 mm.; immediately after smoking it was 44.7, three
minutes later 58.6. One S had before smoking, 14.3 mm.; after smoking,
a tremor of 31, 49, 31, 33, and 31, with continued smoking of two cigarettes.
Another S, a college athlete in the pink of condition, had before smoking
a finger tremor of 30.3. Following the smoking and inhaling his finger 
tremor was 61 and after three minutes, 74.7. 

Other Ss were tried before smoking, after smoking without inhaling,
and then smoking with inhaling. Examples of results follow: 44, 20, 63;
19, 18, 57; 24, 23, 44. 

The trials in this series include not only smoking cigarettes, but 
smoking cigar and pipe. Results for smoking of pipe for the control who 
did not inhale are as follows: control 24.3, 22.6, 23.3, 18.3, 22. The S who
smoked and inhaled pipe smoke showed before smoking 24.0, after, 59.7,
61.7, 89.0, 84.3, 77, and 59. Another pipe smoker who inhaled began
with a tremor before smoking of 34.5. After smoking pipe and inhaling
his tremor was 55, 60, 45.3 at 10 minute intervals. Another pipe smoker
who inhaled started before smoking with a finger tremor of 30; after
smoking and inhaling the tremor was 72, 64, 63. 

The following example may be given for cigar smoking with and with- 
out inhaling. The control who did not inhale had before smoking an 
average of 13.6 and after smoking 13.6, 12 and 10. The S who inhaled 
cigar smoke had before smoking an average of 9.7 and after smoking 32.3, 
34.6, and 26. 

Both the reports of the group with group controls, and the individual 
cases with controls showed large and significant differences in finger 
tremor after inhaling. No significant differences were found following
smoking without inhaling. 


Experiment 4. Effects of Withdrawal of Smoking 


Another experiment was made with 100 Ss, to check the results of the 
foregoing to discover effects of withdrawal and especially to determine 
the statistical significance of results. 

Subjects. College students, selected at random, 50 men and 50 
women, all of whom were habitual smokers, aged 17 to 33. These were 
chosen upon their willingness to go without smoking for a period of two 
hours. The time of withdrawal of smoking was actually between 2 and
2½ hours. They were not watched during the 2 to 2½ hours, but most of
them were students in the Department of Psychology, interested in the
experiment, and their word was accepted that they had abstained for 
two hours. 

Results. For all Ss the following results appeared in the following 
order: Before smoking, after smoking, after abstaining from smoking for 
at least two hours and after smoking again after the withdrawal period: 
45.4, 58.0, 40.9, 58.6. All C. R.’s were significant, the C. R. between
the averages before and after withdrawal being 4.44. It is thus apparent
that for 100 unselected smokers a period of two or more hours' withdrawal
gives a significant decrease in finger tremor. For the women the
reduction in finger tremor was somewhat greater than for the men. In all
cases the curve shows a rise after smoking before the withdrawal, a
significant decrease following the withdrawal period, and a significant
increase again after smoking. From this experiment it appears clear that
withdrawal on the whole reduces finger tremor. The comments of the 
students indicated that many of them felt as well or better than usual 
and that they were somewhat surprised at the results. 
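
The rise, fall and renewed rise described for Experiments 3 and 4 can be read off the four reported means for each group. The sketch below simply tabulates the successive differences; the names MEANS_MM and successive_changes are illustrative assumptions, and only the reported means are used (no significance test is attempted, since the standard deviations are not given).

```python
# Mean finger tremor (mm) at the four measurement periods, as reported:
# before smoking, after smoking, after about 2 hours without smoking,
# and after smoking again.
MEANS_MM = {
    "8 controls, no inhaling": [30.8, 26.9, 27.8, 27.2],
    "22 smokers, inhaling": [28.1, 51.1, 32.2, 53.9],
    "100 smokers, inhaling": [45.4, 58.0, 40.9, 58.6],
}


def successive_changes(values):
    """Differences between each period and the one before it, in mm."""
    return [round(later - earlier, 1) for earlier, later in zip(values, values[1:])]


for group, means in MEANS_MM.items():
    print(group, successive_changes(means))
```

For both inhaling groups every change exceeds 12 mm in magnitude, while for the non-inhaling controls no change exceeds 4 mm.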

A brief summary of some important points from these experiments
is shown in Tables 1 and 2.


Table 1


Results in mm. of the Comparison of Non-Smokers and Smokers Before and After
Smoking One-half a Cigarette; and Before and After Taking
8 Puffs on a Cigarette in One Minute


                                 Before    After    Increase    C. R.

½ cigarette:  50 non-smokers      31.2     36.8       18%       0.5
              50 smokers          48.0     67.0       39%       2.7

8 puffs:      28 non-smokers      26.7     44.4       65%       1.96
              80 smokers          35.5     61.5       84%       5.8

Table 2


Effect (in mm.) of Withdrawal of Smoking*
Note: Results are arithmetical means before and after smoking; and after 2 hours
without smoking; and immediately after smoking again following the withdrawal.

                                               Withdrawal    Smoking
                            Before    After    2 Hours       Again

8 controls, no inhaling      30.8     26.9     27.8          27.2
22 smokers, inhaling         28.1     51.1     32.2          53.9
100 smokers, inhaling        45.4     58.0     40.9          58.6

* All C. R.'s for the 22 inhalers and the 100 inhalers were significant, the C. R.’s
before and after withdrawal being more than 4.


The reports of the Ss after withdrawal did not indicate any increase
in nervousness and many of them said that evidently they simply thought
they needed a smoke, but actually felt as well or better after the two-hour
period of no smoking. Some of them expressed surprise at the results
which were shown them. So far as our results go there is at least no
suggestion that students cannot go through a 2 to 2½ hour period of
examination without smoking. There are, of course, other factors to
consider; the habit of smoking and the urge to smoke may be disturbing
when one finds oneself in a situation with a strong urge to smoke, but not
permitted to do so. The question, however, may be raised as to whether
such a person is not smoking excessively, and needs to reduce his smoking
rather than to be permitted to indulge at any and all times. No evidence
was found, but perhaps in too limited a number of cases, to show that
smokers who do not inhale were inconvenienced by not being permitted
to smoke for from two to 2½ hours. The two last series of experiments
suggest that the greatest finger tremor, or for that matter, practically all
increase of finger tremor, is found in those cases who smoke and inhale,
and that the withdrawal of smoking from 2 to 2½ hours does not increase
finger tremor, except in a few cases and that very little. From the statis- 
tical results and the comments it seems to appear that withdrawal is 
beneficial. 


Experiment 6. Smoking “Denicotinized” Cigarettes 


Our results so far have been with smoking that involved nicotine; the 
two following experiments were made for the purpose of comparing these 
results with smoking which did not involve nicotine. One experiment 
was with so-called “denicotinized” cigarettes. The extent to which the 
nicotine has been removed is not known. The second experiment was 
with corn silk in which there was not any nicotine. 

With the same apparatus and standard procedure a series of experi- 
ments was made with Ss who had already been used in the experiments 
involving nicotine. With 10 Ss there were no distinguishable results 
found when tremor following the smoking of nationally advertised denico- 
tinized cigarettes was compared with our results reported in other ex- 
periments. Since the results definitely duplicated those already found 
with standard cigarettes, the experiment was discontinued. 


Experiment 5. Smoking Corn Silk 


In order to be sure that there was no nicotine present dried corn silk 
was obtained from the University farm and was smoked by a number of 
our former Ss. 

Subjects. Four college students, one woman and three men, selected 
because of their interest in the experiment. It was found necessary to
smoke the corn silk in pipes since it appeared to be very difficult to make
cigarettes. The four students above mentioned were experimental Ss
who inhaled, but the controls did not inhale.

Results. Very clearly it appeared that there were no significant
changes in tremor while smoking corn silk, either in the experimental or
control Ss. Several pipefuls were smoked, and when no results appeared,
the experiment was finished with the smoking of a standard brand of
tobacco cigarette. Immediately the finger tremor increased as is shown in
our other experiments, with Ss who inhaled. The smoking of the corn
silk was continued nearly an hour in all cases, and yet no differences in
finger tremor were apparent. This was in sharp contrast to the im- 
mediate rise when the Ss who inhaled finished the experiment with the 
standard brand cigarette and so-called denicotinized cigarettes. 


Experiments 6 and 7. Effect of Breathing Cigarette Smoke 


One further question was raised, namely, to what extent would breath- 
ing cigarette smoke in a smoke filled room affect finger tremor. It is re- 
lated to the question of the effect of breathing in busses, trains or rooms 
where people are smoking. 

Two advanced students attempted, under the direction of the writer, 
to find the answer to this question. Two rooms were required for this 
experiment: one thoroughly ventilated and without any tobacco smoke 
for the control measurements; the other, a room of approximately the 
size of a fairly large bus, except that the ceiling was somewhat higher. 
The control room was used before the S was taken into the smoke filled
room. After control measurements were taken, S was taken into the 
smoke filled room and measured at the end of 3, 6, and 9 minutes after 
breathing the air of this room. Ss did not smoke throughout the ex- 
periment. 

There was no measurement of the actual amount of smoke in the 
experimental room but in the two experiments there was more smoke than 
is ordinarily found in so-called smoke filled rooms. In one experiment 
the Es smoked and had cigarettes burning without smoking so that there 
was definitely as much or more smoke than is found in the ordinary 
smoke-filled room. In the other experiment, the room was filled with 
cigarette smoke so that Ss' eyes were affected and many Ss indicated
distinct discomfort. The second condition was extreme and much worse
than would be found in any smoking room, bus or train. 

Apparatus. The writer's finger tromometer was used.

Subjects. There were 80 Ss, 40 men and 40 women in the two
experiments; half of the men and half of the women were used in each
experiment.

Procedure. Standard procedure was used in both experiments. First 
the Ss were required to rest from 3 to 5 minutes and the control measure- 
ments were made. They were then taken into the smoke-filled room and 
measured at intervals of 3, 6, and 9 minutes.

Results. Although disagreeable feelings were reported in both
experiments, especially the one with the greater amount of smoke, no
significant results in finger tremor were found in either experiment. For
all means and medians the differences were not greater than 4 mm. This
is quite insignificant. So far as these experiments go, therefore, Ss who
do not smoke during the period of the experiment, but who breathed air
abnormally filled with cigarette smoke, show no significant increases in 
finger tremor. 


Conclusions 


1. Our experiments seem to show that smoking without inhaling has a
small, or negligible, effect upon finger tremor.

2. In contrast to this the smoking of standard tobacco cigarettes with
inhaling has been followed immediately, even during the smoking of the
first cigarette, by a large and significant rise in finger tremor.

3. Withdrawal of smoking for two to two and one-half hours has
shown large and significant decreases in finger tremor for Ss who inhaled.
Following the withdrawal period and after smoking again with inhaling,
finger tremor has increased greatly.

4. No differences from the above conclusions can be found with the
use of so-called “denicotinized” cigarettes. When smoking these
cigarettes and inhaling, large and significant increase in finger tremor has
appeared.

5. Experiments with cigars and pipe smoking have shown the same
results as the smoking of standard cigarettes, namely, no increase in finger
tremor, or a negligible amount, with the Ss who did not inhale; large
increases in finger tremor when Ss did inhale.

6. The results are very different with the smoking of dried corn silk.
No increase of finger tremor, or very little, was found with the smoking
of corn silk which was continued for one hour, even though the Ss inhaled.
The results of the experimental and control groups were practically the
same. There was no increase of finger tremor for either.

7. Breathing cigarette smoke in amounts equal to and greater than
that commonly found in smoke filled rooms, busses and trains, does not
appear to be followed by any increase in finger tremor.

8. So far as our experiments go, it appears that increase in finger
tremor is related especially to the use of nicotine and inhaling.
Received October 6, 1947. 


Factors in the Design of Clock Dials Which Affect Speed and
Accuracy of Reading in the 2400-Hour Time System 


Walter F. Grether 
Aero Medical Laboratory, Wright Field, Ohio 


People commonly experience difficulty in using the 2400-hour time 
system which has become standard in military practice. When time is 
read from a 12-hour dial, it is necessary to add 1200 to all readings after
12:00 A.M. The mental arithmetic thus required introduces an
opportunity for error, and also some delay in obtaining the desired figure. On
the other hand, 24-hour dials designed to give readings directly in
2400-hour time are, at first glance, quite confusing to persons who have spent
their entire life reading time from 12-hour clocks. In the 24-hour dial,
only one of the hourly positions can appear in its conventional location.
In addition, an interval of one hour on the hour scale corresponds to 2½
instead of 5 minutes on the minute scale. One of the major purposes of
the present experiment was to find out whether the 12-hour or 24-hour 
dial can be read more easily when readings are required in 2400-hour time. 
A further purpose was to evaluate a number of the possible factors in the 
design of either type of dial which might influence speed and accuracy 
with which readings are obtained. 
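
The mental arithmetic referred to above, and the 2½-minute figure for the 24-hour dial, can be made concrete with a short sketch. The function names to_2400 and minutes_per_hour_mark are hypothetical, and the handling of the 12 o'clock hour (12:xx A.M. written as 00xx, 12:xx P.M. left as 12xx) is an assumption based on common 2400-hour usage rather than something the text specifies.

```python
def to_2400(hour, minute, meridiem):
    """Convert a 12-hour dial reading to a four-digit 2400-hour string."""
    if not (1 <= hour <= 12 and 0 <= minute <= 59):
        raise ValueError("hour must be 1-12 and minute 0-59")
    meridiem = meridiem.upper().replace(".", "")
    if meridiem not in ("AM", "PM"):
        raise ValueError("meridiem must be A.M. or P.M.")
    hour24 = hour % 12          # assumption: the 12 o'clock hour counts as 0
    if meridiem == "PM":
        hour24 += 12            # the "add 1200" step for afternoon readings
    return f"{hour24:02d}{minute:02d}"


def minutes_per_hour_mark(hours_on_dial):
    """Minute-scale divisions spanned by one hour interval on the dial."""
    return 60 / hours_on_dial


print(to_2400(3, 25, "P.M."))
print(minutes_per_hour_mark(12), minutes_per_hour_mark(24))
```

On a 12-hour dial one hour spans 5 minute divisions; on a 24-hour dial it spans 2½, which is the source of the unfamiliar spacing mentioned in the text.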

The clock dial designs used in the experiment were selected in order 
to make possible a comparison of the following variables: 


1. A 12-hour vs. a 24-hour dial.

2. Use of numerals vs. no numerals on the minute scale.

3. Use of 1-minute vs. 5-minute graduations on the minute scale.

4. Use of numerals at all hourly positions vs. replacement of some
numerals with mere reference marks.

5. Addition of a 13 to 24-hour scale to the 12-hour dial vs. no such
scale.

6. Placement of the 24-hour position at the top vs. the bottom of a
24-hour dial.

7. Placement of the 60-minute position at the top vs. the bottom of
a 24-hour dial.


Experimental Procedure 


For the purposes of this experiment, 11 different designs of clock dials
were prepared. A sample of each of these designs, considerably reduced
in size, is presented in Figure 1. Five of these clocks, types A through E,
are variations of the 12-hour clock. The remaining six are variations of
the 24-hour clock. Mock-ups of the 11 dials were prepared with movable
hands. These were then photographed with the hands in different
positions to make up the actual items of a printed test. This test was made
up in two parts. In Part I there were 10 reproductions of each clock
face. The different dial designs were intermingled in a predetermined
irregular sequence so that the subject was required to change from one
dial to another as he worked on the successive items of the test. A time
limit was used for the entire 110 intermingled items, and thus no speed
data could be obtained for any individual dial design. Part II of the
test was prepared with 10 reproductions of each dial presented
successively, thus making possible the use of a time limit for each of the 11
designs presented. In Part II, therefore, both accuracy and speed data
could be obtained.

In preparing the clock-reading test precautions were taken to equalize
all factors which might contribute to reading difficulty except for those
factors which were being studied. All dials were 2.25 inches in outer
diameter. All numerals and graduation marks of comparable meaning
were of the same dimensions on all dials and were sufficiently large to
avoid any problems of visibility. Most of the dials to be compared
directly with one another differed in only one characteristic, so that any
difference in speed and accuracy of reading could be attributed to this
one difference. As shown in Figure 1, an A.M. or P.M. beside each clock
indicated to the subject whether the time shown was before or after
12:00 noon.

Additional precautions were taken to equalize the inherent difficulties
of the time settings on the different clock designs. It was assumed, for
example, that A.M. readings would be less difficult than P.M. readings,
and that readings would be easier when the minute hand is on a five
minute graduation mark than when it is on an intermediate position.
For this reason, all clock dial designs were equalized with respect to the
number of A.M. and P.M. readings, number of readings at 5-minute
positions, average magnitude of minute readings, and number of hour readings
at major positions (i.e., 3, 9, 12, 15, etc.). In determining the sequence
in which the test items appeared in Part I of the test, precautions were
taken to insure that there was no grouping of items involving a particular
clock near the beginning or end of the test. The actual items in Part II
of the test used different time settings from those in Part I. Subjects
were instructed to read the clocks to an accuracy of one minute, and there
were no settings of the minute hand between one-minute graduations.

This test was administered to 62 rated military personnel (pilots,
navigators, and bombardiers) at Wright Field and to 100 advanced
mathematics students in a Dayton high school. All subjects took Part I
of the test prior to Part II. In taking Part II of the test, however,
approximately one-half of the subjects began at the front of the test booklet.
The remaining one-half of each group took Part II of the test in reverse
order. That is, they first completed the 10 items for the last dial design
in the booklet in the order in which they appeared, then those for the
second from the last dial design in the booklet, etc. For the rated
military personnel, a time limit of 15 minutes was used on Part I of the test
and a time limit of 45 seconds on each section of Part II of the test. For
the high school students, a 19-minute time limit was used for Part I and
a 1-minute time limit for each section of Part II.

Fig. 1. Experimental designs used in the study of clock dials for reading in the
2400-hour time system.


Table 1

Per cent Errors and Time per Reading for Eleven Experimental Clock Dials*

         Rated Military Personnel (N = 62)     High School Students (N = 100)

         Part I     Part II                    Part I     Part II
Clock    Per cent   Per cent   Sec. per        Per cent   Per cent   Sec. per
type     errors     errors     reading         errors     errors     reading

A          7.2        6.4       5.28            12.2       11.8       7.52
B          5.6        7.1       5.39             8.6       13.8       7.88
C         19.0       19.1       5.61            27.4       23.5       8.55
D          8.7       13.8       5.69            18.0       20.9       8.24
E          7.3       14.5       5.34            13.0       23.9       8.10
F          7.4        8.0       4.93            13.3       15.6       6.90
G          4.2        6.8       4.79             6.1        8.4       6.56
H         10.8       17.3       5.40            15.3       22.8       7.95
I         12.8       17.7       5.45            19.6       29.5       7.79
J          T          3.6       5.02            14.7        4.9       6.86
K         42.8       14.7       5.64            35.9       14.2       7.82


* Significance of differences.

The results for per cent errors can be considered significant (1 per cent
level of confidence) if the difference between clocks is equal to or
greater than the following:

When the average error score for
two clocks being compared is:          5%       10%      20%
    For rated personnel                3.5%     4.7%     6.2%
    For high school students           3.3%     4.0%     5.4%

The results for time per clock reading can be considered significant
(1 per cent level of confidence) if the differences between clocks are
equal to or greater than the following:

    For rated personnel                .20 sec.
    For high school students           .81 sec.
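
As an illustration of how the note to Table 1 is meant to be applied, the sketch below checks whether two clocks differ significantly for the rated personnel. The thresholds are taken as 3.5, 4.7 and 6.2 percentage points at average error scores of 5, 10 and 20 per cent, and .20 sec. for reading time. The note tabulates only those three levels, so the rule used here for intermediate averages (take the next tabulated level at or above the average) is an assumption of this sketch, as are all the function and variable names.

```python
# Minimum differences (percentage points) at the 1 per cent level for rated
# personnel, keyed by the average error score of the two clocks compared.
ERROR_THRESHOLDS = {5.0: 3.5, 10.0: 4.7, 20.0: 6.2}
TIME_THRESHOLD_SEC = 0.20


def error_threshold(avg_error):
    """Assumed rule: use the next tabulated level at or above the average.

    Averages above 20 per cent fall back to the largest tabulated value,
    which may understate the difference actually required.
    """
    for level in sorted(ERROR_THRESHOLDS):
        if avg_error <= level:
            return ERROR_THRESHOLDS[level]
    return ERROR_THRESHOLDS[max(ERROR_THRESHOLDS)]


def errors_differ(err_a, err_b):
    """True if two per cent error scores differ by at least the threshold."""
    avg = (err_a + err_b) / 2
    return abs(err_a - err_b) >= error_threshold(avg)


def times_differ(sec_a, sec_b):
    """True if two mean reading times differ by at least .20 sec."""
    return abs(sec_a - sec_b) >= TIME_THRESHOLD_SEC


# Part I error scores, rated personnel: clock A vs. clock C, clock A vs. clock B.
print(errors_differ(7.2, 19.0), errors_differ(7.2, 5.6))
# Part II seconds per reading, rated personnel: clock A vs. clock G.
print(times_differ(5.28, 4.79))
```

Under these assumptions clocks A and C differ significantly in Part I errors, clocks A and B do not, and clocks A and G differ significantly in reading time.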




Results 


A summary of the major results of this experiment is presented in
Table 1, which shows the per cent errors (of one or more minutes) on each
clock, for both parts of the test, and for both groups of subjects. The 
table also shows the time per clock reading in seconds for Part II of the 
test, for both groups of subjects. At the bottom of the table are shown 


Fig. 2. Per cent errors in military time readings on eleven experimental
clock dials (Part I).


the estimated differences required for significance at the 1 per cent level. 
Differences in the results for any two clocks which are equal to or greater 
than the differences at the bottom of the table, may be assumed to be 
genuine differences and not the result of chance factors. 
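
The rule stated in the footnote to Table 1 can be sketched in code as follows. This is only an illustration under stated assumptions: the thresholds are those for rated personnel as read from the scanned table, the function names are hypothetical, and linear interpolation between the tabulated average error scores is an assumption made here, not a procedure given in the text.

```python
# Required differences (1 per cent level) for rated personnel, as read from
# the footnote to Table 1: (average error score in %, required difference in %).
RATED_THRESHOLDS = [(5.0, 3.5), (10.0, 4.7), (20.0, 6.2)]


def required_difference(avg_error, thresholds=RATED_THRESHOLDS):
    """Return the difference needed for significance at a given average error.

    Assumption: values between tabulated points are linearly interpolated,
    and values outside the tabulated range use the nearest tabulated point.
    """
    points = sorted(thresholds)
    if avg_error <= points[0][0]:
        return points[0][1]
    if avg_error >= points[-1][0]:
        return points[-1][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= avg_error <= x1:
            return y0 + (y1 - y0) * (avg_error - x0) / (x1 - x0)


def is_significant(error_a, error_b, thresholds=RATED_THRESHOLDS):
    """Compare two clocks' per cent errors against the tabulated rule."""
    avg = (error_a + error_b) / 2.0
    return abs(error_a - error_b) >= required_difference(avg, thresholds)


# Part II, rated personnel: clock A (6.4) vs. clock C (19.1), then A vs. B (7.1).
print(is_significant(6.4, 19.1))
print(is_significant(6.4, 7.1))
```

Under these assumptions the A versus C difference exceeds the required value while the A versus B difference does not.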

The results presented in Table 1 are also presented in the form of bar
diagrams in Figures 2, 3 and 4. It will be noted in Table 1 and Figures
2, 3 and 4 that the data for high school students and rated personnel
present substantially the same overall picture. In general, also, the
differences which appeared among the dials in Part I of the test reappear
in Part II. Thus, although many of the differences between dials on one




part of the test and for one group of subjects are not significant, the fact 
that the differences are in the same direction in Part II and for the other 
group of subjects greatly increases the likelihood that the differences are 
significant. 

In general, accuracy of readings was somewhat lower in Part II of the 
test, even though successive items were similar, probably because the 


Fig. 3. Per cent errors in military time readings on eleven experimental
clock dials (Part II).


timing of individual sections motivated the subjects to work at greater
speed. 

In Table 2, an analysis is presented of the various types of error made 
on the different clock dials in Part II of the test. Most of the errors 
were 1 minute, 5 minutes, 1 hour or 12 hours in magnitude. The frequency
of each of these types of error is shown for each dial for both the
military personnel and the high school students. 
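
As an illustration of the error categories tabulated in Table 2, the sketch below sorts a single reading into one of those categories. The scoring routine itself is not described in the text, so the function names, the four-digit string format for readings, and the treatment of the 24-hour day as circular are all assumptions made for this example.

```python
# Error sizes (in minutes) that have their own column in Table 2.
ERROR_LABELS = {1: "1 min.", 5: "5 min.", 30: "30 min.", 60: "1 hr.", 720: "12 hr."}


def to_minutes(hhmm):
    """Convert a 2400-hour reading such as "1435" to minutes after 0000."""
    return int(hhmm[:2]) * 60 + int(hhmm[2:])


def classify_error(shown, reported):
    """Label the discrepancy between the time shown and the time reported."""
    diff = abs(to_minutes(shown) - to_minutes(reported))
    # Assumption: 2359 and 0000 are one minute apart (circular day).
    diff = min(diff, 1440 - diff)
    if diff == 0:
        return "correct"
    return ERROR_LABELS.get(diff, "all other")


print(classify_error("1435", "1436"))  # a 1-minute error
print(classify_error("1435", "0235"))  # a 12-hour error
```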

Table 3 presents a number of product-moment correlation coefficients
which aid in an evaluation of the experimental method used in this
investigation. The correlations between speed and accuracy of individuals




are low but positive for both rated military personnel and high school 
students, although the correlation for military personnel is not significant.
This result indicates that the more rapid individuals tend also to be more 
accurate. The correlations between speed and accuracy for the different 
dials are high and positive for both groups of subjects, indicating that 
the dials which can be read with the greatest speed can also be read with 


Fig. 4. Speed of military time readings on eleven experimental
clock dials (Part II).


the greatest accuracy. The correlations between errors on Parts I and 
II of the test are positive but not significant for both groups of subjects. 
This would seem to indicate that the results depend to a considerable 
extent upon whether the individual dial designs are intermingled as in 
Part I or grouped as in Part II. The final correlations, between the
results for the two groups of subjects, are positive and very high, indicating
that the relationships among the results for the different dials were quite
independent of the experience of the subjects with military time.




Table 2

Frequency of Several Types of Error in Reading of Eleven Experimental
Clock Dials (Part II of Test)

        Rated Military Personnel (N = 62)        High School Students (N = 100)
        Size of error (frequency)                Size of error (frequency)
Clock   1     5     30    1     12    All        1     5       30    1     12    All
type    min.  min.  min.  hr.   hr.   other      min.  min.    min.  hr.   hr.   other

A       6     8     0     4     1     16         19    22      [?]   [?]   [?]   [?]
B       4     3     0     2     9     18         10    4       2     20    34    35
C       28    18    0     26    6     16         26    28      0     7     25    99[?]
D       5     24    0     18    8     12         21    383[?]  0     34    24    99[?]
E       9     21    0     27    [?]   16         19    27      0     35    60    33
F       8     12    0     12    0     12         19    25      13    37    0     20
G       6     9     0     16    0     7          8     9       1     38    3     19
H       26    14    1     11    1     35         35    20      1     59    1     56
I       38    13    0     38    0     27         79    25      2     69    1     348[?]
J       3     4     5     2     1     6          8     3       9     10    1     12
K       8     1     36    7     6     9          14    1       45    16    4     30

Table 3

Correlations Among Several Variables in Experiment on Design of Clock Dials

                                                         Correlation  Significance
Variables                                                     r          level

Relation between speed and accuracy of individuals
  Per cent errors vs. number of items omitted (Part II)
    N = 62   Rated military personnel                       +.05       Not sig.
    N = 100  High school students                           +.34       1%

Relation between speed and accuracy for different dials
  Per cent errors vs. seconds per reading (Part II)
  N = 11 clock dial designs
    For rated military personnel                            +.69       5%
    For high school students                                +.72       1%

Relation between results on two parts of test
  Per cent errors on Part I vs. per cent errors on Part II
  N = 11 clock dial designs
    For rated military personnel                            +.45       Not sig.
    For high school students                                +.27       Not sig.

Relation between results for two groups of subjects
  Rated military personnel vs. high school students
  N = 11 clock dial designs
    Per cent errors on Part I                               +.94       1%
    Per cent errors on Part II                              +.89       1%
    Seconds per reading on Part II                          +.91       1%
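
For readers who wish to see how a product-moment coefficient of the kind reported in Table 3 is obtained, a minimal sketch follows. The function name is hypothetical. The data are the Part II values for rated military personnel as read from the scanned copy of Table 1, so the result should be close to, but need not exactly reproduce, the tabled coefficient for per cent errors vs. seconds per reading.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Clocks A through K, Part II, rated military personnel (transcribed values).
per_cent_errors = [6.4, 7.1, 19.1, 13.8, 14.5, 8.0, 6.8, 17.3, 17.7, 3.6, 14.7]
sec_per_reading = [5.28, 5.39, 5.61, 5.69, 5.34, 4.93, 4.79, 5.40, 5.45, 5.02, 5.64]

r = pearson_r(per_cent_errors, sec_per_reading)
print(round(r, 2))
```

A positive value here means that the dials read more slowly were also the dials read with more errors.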




Evaluation of Results with Regard to Experimental Method 


1. Lack of control of either the time or speed variable. In this experiment 
speed and accuracy were allowed to vary independently. The positive 
correlations between speed and accuracy of individuals (Table 3) show 
that the individuals who achieved the greatest accuracy did not do so at 
the expense of increased time. More important for this investigation,
however, are the relations between speed and accuracy of the 11 different 
dials. The high correlations indicate that those dials which can be read 
most accurately are also read most quickly, and thus the same conclusions 
are reached whether speed or accuracy is taken as the criterion. Had 
time per test item been equalized for all dials, it is probable that the error 
differences among the several dials would have been accentuated but not 
changed in direction. It is believed that the procedure of allowing speed 
and accuracy to vary independently has value in that it provides two 
criteria for evaluating the design variables under investigation. 

2. Intermingling versus grouping of different dial designs. In Part I 
of the clock reading tests the various dial designs were intermingled in a 
random fashion so that the subject was unable to adjust to a particular 
design. In Part II each design was presented in a separately timed 
section of the test booklet. The correlations between errors in Parts I
and II (Table 3) for the different dial designs are positive but below the 
5% level of confidence. It is concluded from this that the difference in 
method did significantly affect the results. The greatest effect appeared 
in dial design K which had the minute scale rotated 180 degrees from its 
conventional location. In Part II of the test the subjects were able to 
adjust to this arrangement and avoid the high percentage of errors made 
in Part I. The test method in Part II probably simulates more closely 
the clock reading situation in real life, particularly for the aircraft pilot 
or navigator, since he repeatedly reads time from the same instrument. 
There is an additional argument in favor of Part II, namely, that it 
provided speed as well as accuracy data. In Part I of the test the time 
per item could be neither controlled nor measured. It is concluded that
this difference in method does affect the results, and that the method used
in Part II is to be preferred for practical reasons.

3. Previous experience of subject group. In this experiment two quite 
different types of subjects were used: rated military personnel with con- 
siderable training and experience in the use of 2400-hour time, and high 
school students with little if any exposure to this time system. Although
the high school students made more errors and required more time per
test item, the last group of correlation coefficients in Table 3 indicates
that the basic findings were virtually the same for either group of subjects. 




We can conclude, therefore, that in this experiment the nature of the 
previous experience of the subjects was an unimportant variable. 


Evaluation of Results with Regard to Clock Dial Design 


1. Twelve-hour vs. 24-hour dial. Comparison of the first 5 with the 
last 6 clocks in Table 1 shows that there was no major advantage in favor 
of either the 12- or 24-hour dial, although 24-hour dials, Types G and J, 
were superior to the two best 12-hour dials, Types A and B. This was
particularly true for speed of reading. The 24-hour clocks showed somewhat
more 1-hour errors, probably because of the smaller spacing of the
hour numerals. 

2. Numerals vs. no numerals on minute scale. The comparison of 
clocks A and B does not reveal any significant advantage to placing nu- 
merals on the minute scale of a 12-hour dial. In the case of the 24-hour 
dial, however, as indicated by comparison of clocks F and G, there ap- 
peared to be a definite advantage in favor of numerals on the minute 
scale. Dials without numerals on the minute scale showed a considerably 
higher proportion of 5-minute errors in Table 2. 

3. One-minute vs. 5-minute graduations on the minute scale. Comparison
of clocks A and C, and F and I indicates a significant difference
in favor of placing graduations at 1-minute intervals when readings are
required to an accuracy of one minute. Clocks C and I showed a high 
proportion of 1-minute errors. 

4. Numerals at all hourly positions vs. replacement of some numerals 
with mere reference marks. Comparison of clocks A and D, and clocks F 
and H indicates a loss in accuracy when numbers were omitted at some
of the hourly divisions. 

5. Addition of a 13- to 24-hour scale on a 12-hour dial. Clock E, with
the 13- to 24-hour scale added, was inferior to clock A without such a scale.

6. Placement of the 24-hour position at the top vs. the bottom of a 24-hour
dial. Clock G, with the 24-hour position at the top, was best in Part I of
the test, whereas J, with this position at the bottom, was best in Part II of
the test. This would suggest that in a situation where an individual 
can become accustomed to reading a particular clock, as in Part II of 
the test, there is some advantage to placing the 24-hour position at the 
bottom of the dial. 

7. Placement of the 60-minute position at the top vs. the bottom of the 
24-hour dial. The results for clock K show quite clearly that the
unconventional location of the 60-minute position at the bottom of the dial
caused a high percentage of errors.




Summary 


This experiment was carried out to study design factors which influ- 
ence the speed and accuracy of reading clocks in the military or 2400-hour 
time system. Eleven different designs of clock dials were used, of which 
five were variations of the 12-hour dial and six were variations of the 24- 
hour dial. Reproductions of these dials were presented in a printed test 
divided into two parts arranged so as to make possible a determination 
of both speed and accuracy of clock reading. This test was given to 62 
rated Army Air Forces officers and to 100 high school students. 

The following conclusions are drawn concerning clock dial design 
factors which influence speed and accuracy of readings in 2400-hour time: 


1. A 24-hour dial is slightly superior to a 12-hour dial. 

2. Dials with numerals on the minute scale are superior to dials with- 
out such numerals, particularly on 24-hour clocks. 

3. Lack of one-minute graduation marks reduces speed and accuracy 
(when readings to an accuracy of one minute are required). 

4. Lack of numerals at all hourly positions reduces speed and ac- 
curacy, particularly on the 24-hour type dial. 

5. A 12-hour dial with a 13- to 24-hour scale added is not superior to
a dial without this additional scale. 

6. Placement of the 24-hour position at the bottom of a 24-hour dial 
appears to be superior to placement at the top. 

7. Placement of the 60-minute position of a 24-hour dial in its
conventional location at the top is superior to locating it at the bottom.


Received August 11, 1947. 


The Effect of Instrument Dial Shape on Legibility * 


Robert B. Sleight ** 
Division of Education and Applied Psychology, Purdue University 


In spite of a wide diversity of instrument dial types in use today,
objective evidence is lacking which designates one type of dial as more
desirable than another from the standpoint of legibility. In this study,
comparisons were made of five dials of different shapes, all in common use
for certain purposes. It was the aim of the study to determine the relative
legibility of these several differently shaped dials.


Historical Background 


Most instrument dials in use today are of the conventional round
type with moving pointer; but many people interested in the problem of
instrument dial design have seen the possible desirability of other types.
Greatest interest in instrument dial design, especially from the standpoint
of legibility, has been expressed by those concerned with aircraft
instruments, in the reading of which speed and accuracy are often of
vital importance. Beal (1, p. 440) comments on this point as follows:
“The fundamental fact is that the pilot must be able to read all flight
instruments quickly and accurately.”

Other references to dial shape have been made in the literature; for
instance, Stewart (21, p. xvii) mentions the use of edgewise dials in place
of circular dials in aircraft, principally as a means of saving valuable space.
Eaton (5, p. 9) is of the opinion that further development of the vertical
or straight dial for use in aircraft might result in considerable convenience
to the pilot. Hibbard (9, p. 759) suggests, concerning the type of [...]
needed for certain purposes, that: “Research should be done in the d[...]
of an altimeter having on its face an open window in which altitudes can
be presented directly as figures.”


* This research was carried out under subcontract between the Purdue Research
Foundation and The Johns Hopkins University. The subcontract is part of contract
N5-ori-166, Task Order I, between the Special Devices Center, Office of Naval Research,
and The Johns Hopkins University. This paper is Report No. 166-1-33 under this
contract.

** The author wishes to thank Dr. J. A. Bromer, Director of the Instrument Dial
Design Research at Purdue University, Prof. E. J. Asher, Dr. S. E. Wirt, Mr. E.
Dudek, and Mr. J. G. Gleason, also of Purdue University, who provided much advice
and assistance in the performance of this study.






One editorial (30) expresses a belief that it is “more convenient to the 
user of a dial for calibration to be equally spaced. . . . This may be 
accomplished by using an arc of wider radius than the circle which could 
be accommodated in the same space; or by mechanical translation of an 
arc reading into a straight band reading." 

Riggs (20) developed a counter-type indicator to replace the microm- 
eter scale on certain optical devices. This counter, in later tests, 
proved to be markedly superior to the standard scales in terms of reading 
errors made by the operators. Slightly more setting errors were made 
with the counter than with the conventional scale. 

Chapanis (4) found that when numerical information has to be read 
from a piece of equipment, a counter-type indicator is a more efficient 
method of presenting information to an operator than an annular dial. 
If the visual indicator must be used for setting information into the equip- 
ment, however, a counter is not as efficient as a dial. 

Additional confirmatory evidence of the same sort was obtained in 
studies by the Applied Psychology Panel, NDRC, which showed a superiority
of open window or counter-type indicators over micrometer scales
when operators are again simply required to read the scales. 

The Great Britain Air Ministry (29), in 1941, recommended concern- 
ing dial design that “in future design there would be visual advantages 
in designing dials for night use to disclose only that part which needs to 
be read.” They suggest for this, “a moving disc behind an aperture
with a fixed pointer.” 

It is not difficult to understand why few changes have been made 
from the common round type of dial. Besides the natural reluctance to 
vary from the type of dial which habit has established, there is the matter 
of engineering convenience. This latter factor, to some extent at least,
has been responsible for the prevalence of the round dial, because of the
relative ease of obtaining circular movement of a center pivoted pointer.
Lester (11, p. 80) points out another feature of the round (clock type)
dial which makes it convenient from a design standpoint when he says,
“Designers can wrap ten inches of scale around a three inch dial. . . .”

It is evident that many factors may limit the usefulness of a certain 
shape dial; for instance, McFarland (17, p. 429) observes that “the length 
of scales may often preclude the use of linear dials and necessitate the 
use of circular ones. . . .”

Legibility is naturally only one of the factors which must be con-
sidered in determining the applicability of a specific type of dial for a 
certain purpose. Illustrating this point is a recommendation made by 
McFarland (17, p. 429) that “in instruments, as in controls, it would be 
desirable wherever possible to have the action of the indicator correspond 




to the effect that is being produced on the plane or the unit.” For
example, the use of a vertical dial on altimeters, the pointer moving
upward as the plane ascends, might be an advantage over a dial which lacks
this symbolic feature. 

The few experimental studies which have been made of dial legibility 
are of a recent nature. During World War II, Loucks (13, 14, 15, 16) 
directed research in which several features of the currently used aircraft 
instruments were studied from the standpoint of legibility. The major 
conclusions derived from these extensive comparisons of dials were as 
follows: 1. The accuracy with which comparable dials were read decreased
as the number of scale divisions increased. 2. The numbering of
subdivisions tended to decrease the accuracy with which an instrument 
could be read during an exposure of 0.75 second. 3. A reduction in the 
width of a pointer that had partly obscured the smaller numbers and 
scale divisions did not improve accuracy. 4. An increase in the height 
and thickness of letters did not necessarily improve legibility. 5. The 
starting point of a scale had no significant influence on its legibility. 6. 
Mid-division lines that change in value from one part of the scale to 
another proved confusing and gave rise to increased errors. 7. Luminous 
tipped pointers were decidedly inferior to standard hands, but a narrow 
luminous strip along the length of the pointer was satisfactory. On the 
whole, the more simply a dial is designed within the limits of the desired 
accuracy, the less difficulty there will be in reading it. It was found by 
Loucks that even with modification the majority of the dials available
for his studies had a low degree of legibility in terms of the percentage
of error during exposures of 0.75 and 1.5 seconds. 

In a study conducted by Vernon (22) on dial and scale reading it was 
reported that: “If an individual is required to read a number of dials
rapidly, his speed is unlikely to be as great or as regular when the dials are 
differently graduated as when they have the same scale graduations.” 

A study planned and directed by Grether (8), on speed and accuracy 
of dial reading in relation to the dial diameter and spacing of scale divi- 
sions, was recently reported. The main conclusions of the study were 
as follows: 1. “The accuracy with which the position of a pointer can be
read in terms of degrees on a circular scale increases, within limits, as a
function of dial diameter and frequency (or proximity) of scale divisions.”
2. “Speed of dial reading is not systematically related to either
dial diameter or angular spacing of scale divisions.”

The influence of convention and habit, as has been intimated before, 
undoubtedly is a compelling force in creating a preference for a particular
type of dial. This can, however, perhaps be eliminated by proper vali- 
dation of experimental findings and training in the use of dials which are 
desirable from the standpoint of optimum design characteristics. 




Most commercial companies have been guided in their instrument 
dial design by standards developed through conference methods and 
customers' wishes and seldom by experimentally determined criteria.
(Twenty-two instrument and dial manufacturing companies were con- 
tacted by the author concerning instrument dial design practice.) The 
military as well as civilian enterprises have come to realize that the
advent of faster, more complicated machines has emphasized
the problem of precise control. Advancement of instrumentation is the
means of accomplishing this needed control. This instrumentation has 
often developed without due regard for the capabilities of the human who 
is to employ it (26). That this tendency is being replaced by a concern 
for the human factor is illustrated in the following quotation: “New in- 
struments will have new faces, easier to read and interpret. . . ." (28) 

To illustrate the variety of uses to which instruments may be put, the 
following classification of instruments according to function, as given by 
Behar (2), is included here: 


 1. Balancing         7. Detecting      13. Registering
 2. Checking          8. Indicating     14. Sampling
 3. Controlling       9. Integrating    15. Signaling
 4. Counting         10. Measuring      16. Testing
 5. Curve-drawing    11. Metering       17. Timing
 6. Cycling          12. Recording      18. Totalizing


It can be seen that instruments and, hence, instrument dials, serve 
varied purposes. It is important to note at the outset that this study 
investigates only one feature of the over-all legibility aspect of an
instrument dial. Besides the dial shape, there are many dial characteristics
which are of importance in determining dial legibility. These include
such dial features as size and style of numerals; size and style of
graduation marks, number and spacing of marks; shape, size, and direc- 
tion of movement of a pointer; color and contrast of areas of the dial. 
Although some of these features have been examined to a limited degree, 
further study is needed before optimum dial specifications can be objec- 
tively stated. 


Definition of Terms 


Legibility as the term is used in this report carries the connotation of 
recognizability and in addition, meaningfulness. Paterson and Tinker (18) 
use legibility in referring to meaningful reading of printed matter. Most 
studies of legibility in the past have been concerned with various aspects of 
printed and written material, in connection with the meaningful reading of 
words, letters, etc.

Legibility of instrument dials, then, is essentially the degree to which it is
possible to gain meaningful information from given indications.




Legibility should be distinguished from two other closely related concepts, 
namely, acuity and interpretability. Acuity is most often defined as the 
ability to distinguish fine detail. This perception is of a visual threshold 
nature and is not necessarily meaningful. Interpretability is a more complex
concept in that it indicates a condition of preparedness for a complex response
as well as recognition and meaningfulness.


Criterion 
The criterion of dial merit chosen for the present study was that of 
legibility, as measured by the comparative accuracy of readings made 
from various types of dials. Other criteria might be well worth considera- 
tion, such as measures of work decrement in activities involving dial 
reading. No other criterion than that of legibility, however, has been 
included in this study. That this criterion is probably a desirable one is 


indicated by Kelly’s comment (10) that: “Criterion specification for de- 
sign of an improved instrument panel is optimum legibility.” 


Experimental Procedure
Choice of Experimental Method 


Some of the experimental techniques used in the studies on legibility 
of printed matter are applicable to experimentation in dial design. Five 
common methods of measuring legibility have been summarized by Burtt 
and Basch (3) as follows (these are given as referred to in studies of legi- 
bility of type faces): “ (1) maximum distance at which type may be read; 
(2) time taken to read a passage; (3) number of letters read in a tachis- 
toscopic presentation, or minimum exposure at which they can be read; 
(4) minimum illumination under which they can be seen; and (5) extent
to which letters can be thrown out of focus and still be identified.” 

The method used in this study was essentially a form of the third
method noted above, with dial reading accuracy in a tachistoscopic
presentation as the measure.


Subjects 


The subjects used in this study comprised a group of 60 male university
students, principally elementary psychology students. In addition,
five subjects were used in a preliminary experiment and their
responses are considered in the discussion.

The only selection factor operating in the choice of these subjects
was a screening test designed to determine that the subjects had
“normal” visual acuity (corrected or uncorrected) for the testing
distance used in the experiment. The acuity target used was the Snellen
Rating Reading Card.




Apparatus 


Tachistoscope. The apparatus used in this study was a form of mirror 
tachistoscope of original design. This tachistoscope operates on the principle 
that glass is transparent when an illuminated object lies behind it, while the 
same glass functions as a mirror when an illuminated object lies in front. 

An explanation of the tachistoscope will be facilitated by reference to the
schematic plan shown in Figure 1 and the cutaway view in Figure 2. (Letter
notations that follow refer to those shown in Figure 1.) 


Fig. 1. Schematic plan of the tachistoscope.


The exposure apparatus consisted of a large black interior-painted box
which had a partially reflecting mirror M mounted inside at an angle of 45
degrees to the observer's line of sight. At the front of the box was an opening
through which the observer could view the stimulus material S by looking
through the transparent mirror. The stimulus material was inserted in an
opening at the rear of the box. The centers of the front and rear openings
were at eye level for the seated observer.

The pre-exposure area was obtained by means of light from tubular lamps
in the top chamber. This light illuminated a sheet of opal glass G1. The
observer from his position at O could see the image of this lighted area reflected
from the surface of the mirror M. This lighted area served to maintain at a
constant level the observer's light adaptation.

Light from tubular lamps in the lower compartment passed through a
sheet of opal glass G2 and was reflected from the bottom surface of the mirror
on to the stimulus material. Baffles of thin sheet metal set over the opal glass
served to prevent direct illumination of the stimulus material which would
have produced uneven brightness.

When the bottom lights were “on” and the top lights “off” it was possible
for the observer to look through the transparent mirror and view the stimulus
material S; when these lights were reversed the reflection of the opal glass G1
was visible and the stimulus material was obscured.




In this apparatus the pre-exposure field G1 and the exposure field S were
constructed in such a manner that the areas as viewed by the observer were 
equal. Also the intensity of illumination of the pre-exposure lights and the 
exposure lights was adjusted by means of perforated shields so that these fields 
had even and equal brightness. 

The focal distance for the observer's eye was maintained constant by having 
the distance from the observer's eye to the stimulus material equal to the 
distance from the observer's eye to the top surface of the mirror horizontally, 
plus the distance vertically from the mirror to the lighted pre-exposure area G1.


Fig. 2. Cut-away view of the tachistoscope.


The observer was seated in an adjustable chair with his eyes level with the
center of the front opening. The experimenter was stationed at the rear of
the box where he could adjust the dial pointers to desired settings.

The brightness of the pre-exposure and exposure fields for this experiment
was [...] a level of 4.10 foot-lamberts as measured using a [...]
illuminometer. [...] the contrast between the dial background and the
dial markings [...]

Timing Mechanism. The timing mechanism used to regulate the exposure
time of the stimulus material was of the electronic type. This [...] made
possible control through a range of times from one-sixtieth to one second with
a continuously adjustable control. The exposure times used in the preliminary
experiment were as follows: 0.28, 0.20, 0.17, 0.14, and 0.12 second. For the
[...] of [...] seconds was chosen because, due to the simplicity of the dial
reading problem, a brief [...] provide sufficient errors to differentiate
among the dials.

The timer was activated by a push button at the discretion of the
experimenter. [...] to control a direct current relay [...] “break” the [...]
current flowing to the pre-exposure and [...]



Stimulus Materials 


The stimulus materials consisted of five dial types, as shown in Figures 
3 to 7, inclusive. (Representative settings are shown in these Figures 
as the subject, participating in the experiment, viewed the instrument 
dials. The subject, however, viewed them singly.) The features of these
dials other than the dial form or shape were intentionally held as constant 
as possible as an aid in decreasing the number of dependent variables. 


Figs. 3-7. Photographs of dials used in the experiment
(approximately [...] actual size). Fig. 3, vertical; Fig. 4, semi-circular;
Fig. 5, open-window; Fig. 6, round; Fig. 7, horizontal.

Among the features of the dials which were held constant were the
following: numeral dimensions and form, size of graduations, distance
between graduations, position of the numerals and dimensions of the
pointers.




The dials were made up on matte finished white drawing paper using black 
India ink, the No. 4 pen, and the 3506 template of the standard LeRoy Lettering
Outfit. The numerals used closely approximated the Army-Navy numeral
specifications for instrument dials. After these drawings were made they were
mounted securely on the surface of twelve inch square plywood panels and pointers
were attached in a manner which permitted adjustment by the experimenter
through reference to an indicator on the rear of the panel. The diameter of
the round dial used was about two and one-half inches. The circumference 
of this dial was about eight inches. The dimensions of the other dial types 
were derived from those of this round dial. 


Dial Settings. The pointer settings actually used in the experiment 
were determined by using the settings possible on the numerals or mid- 
way between two numerals, i.e., on the major graduations or on the minor 
graduations. 

Dials Chosen for Study. A brief survey of dials in common usage was 
made by checking the available literature. Most thoroughly checked 
in this survey were the dials illustrated in several catalogues of instru- 
ment and dial manufacturing companies. The more common different 
types noted were those used in this study. 


Experimental Design 


The dials were presented to the subjects in a systematically rotated
fashion; e.g., the first subject read the horizontal dial first, then the ver-
tical, next the round, etc.; the second subject read the vertical dial first,
then the round, next the open-window, etc. By thus rotating the order
of presentation of the dials with succeeding subjects, there was accom-
plished an effective and yet simple method of dealing with the problem of
practice and fatigue effect.
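
The rotation described above can be sketched in a few lines of Python. This is only an illustration: the text gives the first three dials for each of the first two subjects, so placing the semi-circular dial last in the base order is an assumption, and the function name is hypothetical.

```python
# Base order inferred from the text: horizontal, vertical, round, open-window.
# The position of the semi-circular dial (last) is an assumption.
DIALS = ["horizontal", "vertical", "round", "open-window", "semi-circular"]


def presentation_order(subject_index, dials=DIALS):
    """Return the dial order for a subject (0 = first subject).

    Each succeeding subject starts one dial later in the base order,
    and the dials skipped at the front are moved to the end.
    """
    shift = subject_index % len(dials)
    return dials[shift:] + dials[:shift]


# Orders for the first five subjects; the cycle then repeats.
orders = [presentation_order(i) for i in range(5)]
for number, order in enumerate(orders, start=1):
    print(number, order)
```

Over any five consecutive subjects every dial occupies every serial position once, which is why practice and fatigue effects are spread evenly across the dial types.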


Administration Procedure 


The procedure for the administration of the experiment as carried out by
the experimenter was as follows:


1. S was seated in an adjustable chair so that the center of the viewing ...


2. E placed the acuity card (Snellen Rating Reading) in the panel holder ...


... the transparent mirror, illumi... the first dial which you will read."


... paratus are so arranged that they will ... a brief view of the dial.
I will make the ...


... midway between two numbers. You will give the reading shown by the ...




6. S was advised that he would be given some practice readings. S was 
given random settings until he answered two successive readings correctly. 

7. Before activating the timer, S was prepared by E saying, "Ready, now!” 
Then E pushed the button and activated the timer to expose the dial. 

8. E made the settings as listed on the test sheet. 

9. E recorded errors made by S opposite the actual setting on the data
sheet.

10. For successive dials the instructions to S were abbreviated. S was
shown the next dial and told, “This is the next dial.” The subsequent pro-
cedure was repeated as for the first dial.

Comments on Experimental Set-up. In connection with the experimental 
set-up, the following points seem worthy of some discussion: 

1. The brightness of the pre-exposure, exposure, and post-exposure fields 
was equal. This permitted the subject to maintain a constant condition of 
adaptation. It is reported that a dark post-exposure field allows the retinal 
after-response to supplement the exposure. A very bright post-exposure field 
washes out the retinal image relatively quickly (25, 27). In general, constant 
light adaptation is preferable in studies of this type because if the pre-exposure 
field is dark, the time between successive exposures would need to be constant 
to provide comparable conditions. 

2. The illuminated area of the pre-exposure, exposure, and post-exposure
fields was equal. This tended to reduce distraction which might have resulted
from variations in the size of successive visual stimuli.

3. The fixation area in the pre-exposure field was at the same optical dis-
tance as the stimulus object. This enabled the subject’s eyes to be properly
focused and converged in advance of the brief exposure.

4. The pre-exposure and post-exposure fields succeeded each other without
motion visible to the eye. A slow motion would have tended to cause a pursuit
movement of the eyes and lead them away from the fixation area.

5. There was an absence of distractive noises and moving parts.

6. The duration of exposures was of sufficient length to allow a clear view
of the stimulus material but brief enough to prevent successive views. Whipple
(25, p. 226) reports that an exposure time of 0.15 second allows only one view.

7. The ready signal was adequate to prepare the subjects for attending to
the exposure. Absence of a ready signal might result in shifting of view and
momentary inattention and consequent poor performance.

8. The fore-period was of variable duration as recommended for this type
of experimentation (7, p. 383). A constant fore-period leads to anticipation
by the subject and too long or short a fore-period does not permit the subject
to maintain a favorable “set.”

9. The timing mechanism gave constant exposure times. 


Preliminary Experiment 


An initial pilot experiment was conducted previous to the main ex-
periment. Data of this experiment were subjected to an analysis in
order to evaluate the soundness of the proposed experimental design.
Many of the experimental variables were controlled in the original ex-
perimental plan but a knowledge of the influence of the subject and ex-
posure time variables was desired.

It was also hoped that this brief study would indicate whether or not 
further experimentation would be worthwhile, i.e., whether sufficient dif- 
ference would be noted in the subjects’ readings on the different dials. 




Several exposure speeds were used, the objective being to determine 
an exposure speed such that sufficient errors would be committed, by 
randomly selected subjects, to allow distinction among the dial types on 
this basis. The exposure speeds actually used were as follows: 0.28, 0.20, 
0.17, 0.14, and 0.12 second. 


Results 


Preliminary Experiment 


The data obtained from the preliminary experiment with five male
university students as subjects are shown in Tables 1 and 2. In these
tables it will be noted that three variables are considered; namely,
exposure speeds, subjects, and dial types. These data, when studied using
a Latin square analysis of variance technique, showed that exposure
speeds and subjects did not account for significant variance, while
variance attributable to dial type was clearly significant at well
beyond the 1% level of confidence (12). This indicated that for
further experimentation any of the exposure speeds used in the prelimi-
nary experiment would probably yield discriminative data for the dials.
It was indicated also that the subjects could safely be recruited from the
student group.


                              Table 1
      Errors Made by Five Subjects in the Preliminary Experiment *

                     Exposure Speeds in Seconds
Subjects      0.28      0.20      0.17      0.14      0.12
   A          H  3      O  0      S  4      R  2      V  6
   B          S  2      R  2      V  6      H  1      O  0
   C          V 10      H  6      O  1      S  6      R  0
   D          O  0      S  4      R  4      V 12      H  2
   E          R  3      V  6      H  8      O  0      S  7

* Dial types are designated as follows: H = horizontal, O = open-window,
R = round, V = vertical, and S = semi-circular.


                              Table 2
       Analysis of Variance of the Preliminary Experiment Data

                  Sum of     Degrees of   Estimate of
Source            Squares    Freedom      Population Variance   F Observed *
Exposure Speed       7.6         4               1.9                0.40
Subject             26.0         4               6.5                1.36
Dial Type          169.2         4              42.3                8.87
Residual            57.2        12               4.77                 -

* According to Lindquist (12, pp. 62-65), for degrees of freedom equalling 4 and 12,
an F of 5.41 is significant at the 1% level.
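
As a check on the Latin square analysis, the sketch below recomputes the sums of squares, mean squares, and F ratios of Table 2 from the cell entries of Table 1. It is an illustration under stated assumptions rather than the author's own computation: four cells of Table 1 are hard to read in the source (subject A at 0.28, subject C at 0.17, and subject E at 0.20 and 0.12), and the values used here (3, 1, 6, and 7) are the ones that reproduce all of the sums of squares reported in Table 2. The function name is hypothetical.

```python
# Table 1 as (dial, errors) per subject, in exposure order 0.28 ... 0.12.
# Cells A/0.28, C/0.17, E/0.20 and E/0.12 are assumed values (see lead-in).
TABLE_1 = {
    "A": [("H", 3), ("O", 0), ("S", 4), ("R", 2), ("V", 6)],
    "B": [("S", 2), ("R", 2), ("V", 6), ("H", 1), ("O", 0)],
    "C": [("V", 10), ("H", 6), ("O", 1), ("S", 6), ("R", 0)],
    "D": [("O", 0), ("S", 4), ("R", 4), ("V", 12), ("H", 2)],
    "E": [("R", 3), ("V", 6), ("H", 8), ("O", 0), ("S", 7)],
}


def latin_square_anova(table):
    """Partition the total sum of squares of an n x n Latin square."""
    rows = list(table.values())
    n = len(rows)
    errors = [e for row in rows for _, e in row]
    correction = sum(errors) ** 2 / (n * n)

    def ss_from_totals(totals):
        # Sum of squares for a factor whose n levels each hold n cells.
        return sum(t * t for t in totals) / n - correction

    subject_totals = [sum(e for _, e in row) for row in rows]
    exposure_totals = [sum(row[j][1] for row in rows) for j in range(n)]
    dial_totals = {}
    for row in rows:
        for dial, e in row:
            dial_totals[dial] = dial_totals.get(dial, 0) + e

    ss = {
        "exposure": ss_from_totals(exposure_totals),
        "subject": ss_from_totals(subject_totals),
        "dial": ss_from_totals(dial_totals.values()),
    }
    total_ss = sum(e * e for e in errors) - correction
    residual_ss = total_ss - sum(ss.values())
    residual_df = (n - 1) * (n - 2)
    residual_ms = residual_ss / residual_df

    out = {
        name: {"ss": v, "df": n - 1, "ms": v / (n - 1),
               "f": (v / (n - 1)) / residual_ms}
        for name, v in ss.items()
    }
    out["residual"] = {"ss": residual_ss, "df": residual_df, "ms": residual_ms}
    return out


result = latin_square_anova(TABLE_1)
for source, row in result.items():
    print(source, {k: round(v, 2) for k, v in row.items()})
```

Only the F for dial type exceeds the tabled value of 5.41 for 4 and 12 degrees of freedom, in agreement with the conclusion drawn in the text.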


Main Experiment 


Percentage and Statistical Significance of Errors by Dial Types. Table
3 shows the incorrect readings made by sixty subjects on each of the five
dial types. In this instance, the total incorrect readings include those
settings on which the subjects were unable to make readings. The per-
centage of incorrect readings in Table 3 is based on a total of 1020 settings
on each dial. The mean number of errors per subject is based on a total
of sixty subjects used in the experiment. Figure 8 shows graphically the
extent of the incorrect readings on the various dials.


                              Table 3
     Incorrect Readings Made by Sixty Male University Students on
                       Each of Five Dial Types

                                        Dial Types
                                                        Open-      Semi-
                       Horizontal  Vertical   Round     window     circular
Incorrect Readings *       280        362      111         5         169
Percentage Incorrect       27.5       35.5     10.9        0.5       16.6
Mean Number of
  Errors per Subject *     4.67       6.03     1.85        0.08      2.82

* N for the readings = 1020.
  N for the subjects = 60.
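
The percentages and per-subject means of Table 3 follow directly from the error counts. The short sketch below redoes the arithmetic; the round-dial count of 111 is taken from the "Total errors" row of Table 5, and the variable and function names are illustrative only.

```python
# Incorrect readings per dial type (round-dial count as given in Table 5).
INCORRECT = {
    "horizontal": 280,
    "vertical": 362,
    "round": 111,
    "open-window": 5,
    "semi-circular": 169,
}
READINGS_PER_DIAL = 1020   # settings read on each dial
SUBJECTS = 60


def summarize(incorrect):
    """Return {dial: (percentage incorrect, mean errors per subject)}."""
    return {
        dial: (round(100 * count / READINGS_PER_DIAL, 1),
               round(count / SUBJECTS, 2))
        for dial, count in incorrect.items()
    }


summary = summarize(INCORRECT)
for dial, (percent, mean_errors) in summary.items():
    print(f"{dial:14s} {percent:5.1f}%  {mean_errors:.2f} errors per subject")
```

Since 1020 readings were made on each dial by 60 subjects, each subject made 17 readings per dial.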

As shown in Table 3 and Figure 8, there was considerable variation 
in reading efficiency for the five dials studied, errors in reading ranging from 
0.5 per cent for the open-window type to 35.5 per cent for the vertical dial. 
The five dials ranked according to the percentage of error as follows: 


1. open-window.... 0.5%
2. round.......... 10.9%
3. semi-circular.. 16.6%
4. horizontal..... 27.5%
5. vertical....... 35.5%




That the differences in accuracy of reading on the five dial types are
significant throughout is shown in Table 4. This table shows an estimate
of the significance of the differences in percentage of errors between each
pair of dials. All of the “t” values shown in Table 4 are significant at
well beyond the 1% level of confidence (19, p. 53). (Actually, the small-
est “t” value obtained in these comparisons indicated that the chances
are only one in 5000 that the difference could have occurred by chance
variation alone.) Thus, all differences reported are clearly significant.
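
A pairwise comparison of this kind can be sketched as follows. The standard error used here is the common pooled form for the difference between two independent proportions; that choice is an assumption, since the text does not state which formula was used, so the ratios printed need not match the published "t" values exactly. The error counts are those of Tables 3 and 5, and all names are illustrative.

```python
from itertools import combinations
from math import sqrt

# Incorrect readings out of 1020 on each dial type.
ERRORS = {"H": 280, "V": 362, "R": 111, "O": 5, "S": 169}
N = 1020


def compare(count_a, count_b, n=N):
    """Difference between two proportions and its critical ratio.

    Assumes independent samples of equal size and a pooled standard error.
    """
    p_a, p_b = count_a / n, count_b / n
    pooled = (count_a + count_b) / (2 * n)
    se = sqrt(pooled * (1 - pooled) * 2 / n)
    difference = abs(p_a - p_b)
    return difference, difference / se


comparisons = {
    (a, b): compare(ERRORS[a], ERRORS[b]) for a, b in combinations(ERRORS, 2)
}
for (a, b), (difference, ratio) in comparisons.items():
    print(f"{a}-{b}: difference = {difference:.3f}, ratio = {ratio:.2f}")
```

Under this assumption every one of the ten ratios exceeds 2.58, the two-tailed normal value for the 1% level, which agrees with the statement that all differences are significant.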


Fig. 8. Percentage of incorrect readings made on five dial types
(horizontal, vertical, round, open-window, and semi-circular).


Although with the dials used in this experiment it is difficult to define 
the “actual” area covered by any dial, it appears that there was a definite 
positive relationship between what might be termed the effective area 
of the dial and the amount of inaccuracy in reading it. The open- 
window dial, with the smallest effective area, produced the least number 
of errors, while the rectilinear dials (horizontal and vertical) with greatest 
effective area resulted in a proportionately larger number of errors. 

Analysis of Errors by Pointer Settings. Figure 9 shows the frequency 
of errors according to the dial settings used for each of the five dial types. 
From this figure it is possible to visualize the relative number of errors 
made at different settings on each dial. It is evident that for nearly all 
dials more errors occurred on the mid-division settings than on the di- 
vision or whole-number settings. As a partial explanation for the occur- 
rence of most errors on mid-division settings one might hypothesize that 
when subjects were in doubt of the exact readings, but did have an idea 
of the general area of the reading, they would report a whole number 
rather than a mid-division reading. On the other hand, this explanation 
is illogical because of the fact that on whole-number settings a portion of 
the numeral to be read was obscured by the dial pointer. This perhaps
poses the problem of eye-movements made by an individual in reading an
instrument dial; that is, does the individual habitually interpolate be-
tween the numbers on either side of the pointer or actually "see" the
number on, or near, which the pointer is situated?


                              Table 4
     Significance of the Differences between Proportions of Errors
                  Made in Reading the Dial Types *

Dial pair        H-V   H-R   H-O   H-S   V-R   V-O   V-S   R-O   R-S   O-S
Difference
between
proportions     .080  .166  .270  .109  .246  .350  .189  .104  .057  .161

[The rows giving the S.D. of the difference between proportions and the
"t" value for each pair of dials are not legible in the source.]

* H = horizontal, V = vertical, R = round, O = open-window, S = semi-circular.


Fig. 9. Total errors for each dial type according to the settings
used in the experiment. (Number of errors plotted against settings, with
one curve each for the horizontal, vertical, round, semi-circular, and
open-window dials.)


Extent and Direction of Errors Made in Reading the Various Dials. 
Table 5 summarizes most of the data obtained in this experiment and 
emphasizes the extent and direction of errors made in reading the dif- 
ferently shaped dials. An examination of Table 5 shows that no errors 
exceeded plus or minus 2.0 dial units away from the true pointer setting. 
Eighty-four per cent of the erroneously reported readings were within 
plus or minus 1.0 dial units. This suggests that the subjects reading the 
dials could usually discern the general area in which the pointer was 
located, even though they were unable to make precise readings. 

The direction of the errors is indicated in Table 5 by the calculated 
constant errors on the dials. The constant error in the case of all dials 
was positive. The largest constant error of plus 0.0635 was obtained 
with the vertical dial. The next largest constant error was plus 0.0462
with the round dial, showing a tendency to overestimate with these forms.

It will be noticed that in the calculation of the average and constant
errors the number of blank (omitted) readings was subtracted from the
1020 total trials for the dial. The fact that only 27 times out of 5100
trials were subjects unable to report a reading suggests the high visibility
of the pointer and/or the high degree of attention on the part of the par-
ticipating subjects.


                              Table 5
   Extent and Direction of Errors Made in Reading the Various Dials

                                 Number of Errors
Amount                                          Open-        Semi-
of             Horizontal  Vertical    Round    window       circular
Error            +    -     +    -     +    -     +    -      +    -   Total
2.0 units        0    1     8    3     4    0     0    0      0    1     17
1.5 units        7    5    10    7     3    0     0    0      3    1     36
1.0 units       29   46    58   55    35    6     2    0     40   26    297
 .5 units      106   78   131   78    36   25     1    1     48   46    550
  0 units        740         658        909       1015         851     4173
Total +          142         207         78          3          91      521
Total -          130         143         31          1          74      379
Total + and -    272         350        109          4         165      900
Blank              8          12          2          1           4       27
Total errors     280         362        111          5         169      927
Average error   .2688       .3472      .1071      .0039       .1624
Constant error +.0119      +.0635     +.0462     +.0020      +.0167
N               1012        1008       1018       1019        1016
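
The average and constant errors of Table 5 can be reproduced from the error counts. The sketch below is an inference from the tabled figures, not a formula stated in the text: the published values are matched when the average error is taken as the number of erroneous (non-blank) readings divided by N, and the constant error as the number of plus errors less the number of minus errors, divided by N, with N equal to 1020 less the blank readings. One cell of the vertical minus column is illegible in the source; 7 is used because it makes that column sum to the printed total of 143.

```python
# Counts of errors of 2.0, 1.5, 1.0 and 0.5 units, by direction, from Table 5.
TABLE_5 = {
    "horizontal":    {"plus": [0, 7, 29, 106], "minus": [1, 5, 46, 78], "blank": 8},
    "vertical":      {"plus": [8, 10, 58, 131], "minus": [3, 7, 55, 78], "blank": 12},
    "round":         {"plus": [4, 3, 35, 36],   "minus": [0, 0, 6, 25],  "blank": 2},
    "open-window":   {"plus": [0, 0, 2, 1],     "minus": [0, 0, 0, 1],   "blank": 1},
    "semi-circular": {"plus": [0, 3, 40, 48],   "minus": [1, 1, 26, 46], "blank": 4},
}
TRIALS_PER_DIAL = 1020


def error_indices(row):
    """Return (N, average error, constant error) for one dial type."""
    plus, minus = sum(row["plus"]), sum(row["minus"])
    n = TRIALS_PER_DIAL - row["blank"]      # blank readings are excluded
    return n, (plus + minus) / n, (plus - minus) / n


indices = {dial: error_indices(row) for dial, row in TABLE_5.items()}
for dial, (n, average, constant) in indices.items():
    print(f"{dial:14s} N={n}  average={average:.4f}  constant={constant:+.4f}")
```

On this reading both indices count errors without weighting them by their size in dial units.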


Implications of the Findings of this Experiment


Essentially the purpose which instruments serve is to give indications
of existing conditions.

There is a modern trend to make more meaningful the information
received from control mechanisms and instrumentation. It is obvious
that the more realistic and the more meaningful instrument indications
can be made the greater will be the saving in time, elimination of errors,
and resultant over-all efficiency. From the standpoint of efficiency, ex-
cessive time spent looking at, or “reading” an instrument dial in order to
gain information from it, is time wasted.

The findings of this experiment show that with the use of certain dial 
types high accuracy of reading can be achieved even though the time 
during which the individual views the dial is very brief. 

The application of the findings of this experiment, especially that of
the outstanding legibility of the open-window type dial, must be modified
by recognition of the purpose for which a dial is to be used. When it is
desired that the information from instruments be of a numerical nature,
the results of this study indicate that an instrument with an open-window
dial is most desirable. (Counter-type dials were not studied in this ex-
periment, but it is probable that they would show results similar to those
obtained for the open-window dial.) In certain situations, however,
where the instrument is designed to give a representation of two- or three-
dimensional space, a dial giving numerical information seems to be less
appropriate than one which provides a replica of the two- or three-dimen-
sional plot (as in flight and navigation instruments of many types).

In connection with the uses of certain dials studied in the experiment,
legibility might be a secondary consideration. For instance, dials of the
round and semi-circular type offer decided engineering convenience and
their use on this basis alone might be justified. Where direction, right
or left, and up or down, for example, is of value for increasing the meaning-
fulness of the information presented by an instrument dial, the use of
either the vertical or horizontal dials would have advantages. It should
also be noted that the findings of this study refer to the single instrument.
Further research is needed on the optimal design of instruments to be
used in groups or banks.

To summarize: Many factors must be considered in the choice of
dial face design. When legibility, or accuracy of reading, is of primary
importance, the open-window dial, with restricted area, seems to be pref-
erable to the circular dial, the semi-circular dial, the horizontal, or the
vertical dial.


Summary and Conclusions

1. Five instrument dial types (round, vertical, horizontal, semi-
circular, and open-window) were compared for legibility. Legibility
was measured in terms of accuracy of readings made by sixty male sub-
jects when viewing the dials for a brief period.

2. The five dial types were equated for size and style of numerals,
marks and pointers, for contrast (black numerals, marks and pointers
on white backgrounds), for size and brightness of backgrounds, and for
positioning of pointer with respect to numerals and marks. Because of
the variation in dial shape, the effective areas of the several dials varied
considerably.

3. Significant differences in accuracy of reading were found for the
several dials. When each dial was compared with each other, differences
were found which were significant in every case at the 1% level of confi-
dence.

4. In order of accuracy of reading the dials ranked as follows: (1)
open-window; (2) round; (3) semi-circular; (4) horizontal; and (5) vertical.
Accuracy was extremely high (one-half of one per cent of readings in
error) on the open-window dial.

5. Errors on all dial types were more frequent on mid-division than
on whole-number settings, in spite of the fact that on whole-number
settings much of the number was obscured by the dial pointer.


Received January 5, 1948. 
Early publication. 


References


1. Beal, G. Making the cockpit practical for the pilot. S. A. E. J., 1945, 53, 437-440, 496.
2. Behar, M. F. The manual of instrumentation. Pittsburgh: Instrument Publishing, 1932.
3. Burtt, H. E., and Basch, C. Legibility of Bodoni, Baskerville, Roman and Cheltenham type faces. J. appl. Psychol., 1923, 7, 237-245.
4. Chapanis, A. (Unpublished data.) November 1, 1946.
5. Eaton, H. N. Aircraft instruments. New York: The Ronald Press, 1926.
6. Gall, O. C. Field instruments. J. of Sci. Instru., 1933, 10, No. 7, 197-203.
7. Garrett, H. E. Great experiments in psychology. New York: D. Appleton-Century, 1941.
8. Grether, W. F., and Williams, A. C. Speed and accuracy of dial reading as a function of dial diameter and spacing of scale divisions. Wright Field, Dayton, Ohio, Aero Medical Laboratory, March, 1947.
9. Hibbard, D. L. Human engineering and instruments. Instruments, 1945, 18, 759, 846.
10. Kelly, G. A. Notes on aircraft panel design. (Unpublished notes.) November, 1945.
11. Lester, K. Let's make instruments flight-like. Air Facts, 1944, 7, 80.
12. Lindquist, E. F. Statistical analysis in educational research. Boston: Houghton Mifflin, 1940.
13. Loucks, R. B. Legibility of aircraft instrument dials: the relative legibility of tachometer dials. AAF School of Aviation Medicine, Project No. 265, Report No. 1, May 30, 1944.
14. Loucks, R. B. Legibility of aircraft instrument dials: the relative legibility of tachometer dials. AAF School of Aviation Medicine, Project No. 265, Report No. 2, October 27, 1944.
15. Loucks, R. B. Legibility of aircraft instrument dials: the relative legibility of various climb indicator dials and pointers. AAF School of Aviation Medicine, Project No. 286, Report No. 1, November 25, 1944.
16. Loucks, R. B. Legibility of aircraft instrument dials: the relative legibility of manifold pressure indicator dials. AAF School of Aviation Medicine, Project No. 325, Report No. 1, December 7, 1944.
17. McFarland, R. A. Human factors in air transport design. New York: McGraw-Hill, 1946.
18. Paterson, D. G., and Tinker, M. A. How to make type readable. New York: Harper and Bros., 1940.
19. Peters, C. C., and Van Voorhis, W. R. Statistical procedures and their mathematical bases. New York: McGraw-Hill, 1940.
20. Riggs, L. A. (Unpublished data.) November 28, 1944.
21. Stewart, C. J. Aircraft instruments. New York: John Wiley, 1930.
22. Vernon, M. D. Scale and dial reading. Cambridge University: June, ... (Medical Research Council Unit in Applied Psychology.)
23. Wallace, L. W. Intelligence and instruments. Instruments, 1940, 13, No. 7, 185.
24. Warren, H. C. Dictionary of psychology. Boston: Houghton Mifflin, 1934.
25. Whipple, G. M. Manual of mental and physical tests. (1st ed.) Baltimore: Warwick and York, 1910.
26. Whitehead, T. N. The design and use of instruments and accurate mechanism. New York: Macmillan, 1934.
27. Woodworth, R. S. Experimental psychology. New York: Henry Holt, 1938.
28. Editorial (anon.). West. Flying. February, 1946, 22, 44.
29. Great Britain. Air Ministry. FPRC 353. Night vision subcommittee, ... 9, 1941.
30. Anon. When your product needs dials or indicators. Elec. Mfg., June, 1939, 61, 80.


Cumulative Effect of a Series of Campaign Leaflets 


R. W. Dietsch 
Cleveland, Ohio 


and 


Herbert Gurnee 
Arizona State College 


The effect of printed propaganda on the development of social atti- 
tudes has been investigated in several experimental studies. The pur- 
pose of those studies was usually to compare printed with oral material, 
or to compare different kinds of printed propaganda, for example, emo- 
tional versus rational appeals; the material used consisted of a single 
leaflet or folder. 

There seems to have been no experimentally controlled attempt to 
measure the cumulative effect of a series of propaganda leaflets. Reports 
from the field of commercial advertisement indicate a law of diminishing 
returns in the repetition of some advertising copy, and presumably a 
similar result occurs in other kinds of publicity. It is well known that 
the public soon becomes satiated with repeated political propaganda; 
long before a campaign is over many complaints are heard about the in- 
terminable “hot air" of the political office seekers. 

The present study is concerned with the effect of a series of leaflets on
the student-body opinion of a men's college. Since our interest was in
changing rather than in developing attitudes, we purposely sought an 
issue in which interest was high and about which diverse opinions had 
been generated. Such an issue was available in the question, widely 
discussed on many college campuses, whether the athletic program should 
be subsidized in order to provide a football team capable of competing 
with the best in the country, or whether athletics should be kept within 
purely amateur and recreational limits. 

1 F. H. Knower. Experimental studies in changes in attitudes. J. Abn. & Soc.
Psychol., 1936, 30, 522-532. G. W. Hartmann. A field experiment on the comparative
effectiveness of "emotional" and "rational" political leaflets in determining election
results. J. Abn. & Soc. Psychol., 1936, 31, 99-114.




190 R. W. Dietsch and Herbert Gurnee 


Procedure 


During the third week of the football season, when the team seemed
headed for a mediocre season, the following ballot was placed in the mail
box of every undergraduate in the college, with the request that it be
checked and dropped into the box of the editor of the college paper.


Do you think the college should subsidize athletics
to such an extent that its team can compete success-
fully against the top ranking schools in the country?
Absolutely Yes.   Yes.   ?   No.   Absolutely No.


The ballots were ostensibly secret; actually each ballot was numbered
on the back and the returns were recorded according to the name of the
mailbox assignee; 427 students returned the ballot, approximately 60%
of those receiving it. Their votes were distributed as follows:


Absolutely Yes:    Yes:      ?:      No:     Absolutely No:
    49.0%          25.1%    2.3%    10.3%       13.6%


Some students amplified their responses in the form of essays attached
to the ballot. There were 18 football players and, to a man, they voted
“Absolutely Yes.”

On the basis of the returns, the 427 students were divided into four
groups so arranged that the percentage of votes falling in each of the five
categories of response was the same in each group. That is to say, each
of the four groups was made to comprise 49% who voted "Absolutely
Yes," 25% who voted “Yes,” and so on as above. One of the groups
constituted a control and received no leaflets; a second group received one
leaflet, a third received three leaflets, and a fourth five leaflets. The
leaflets were distributed a week apart on a Tuesday afternoon, a time
approximately mid-way between football games. The content in all leaf-
lets was strongly against subsidization. The first of the series read as
follows:


"... Not until we have at least a union, or a soda grill, or even a
dorm."




The second, third and fourth pamphlets were similar in form and 
content to the first. The fifth was somewhat different. It is given 
below: 


“An excellent answer to those men who would have athletics subsidized
can be found in the following letter written by A. J. E. of R.... He writes:
‘Subsidized athletics are on the way out all over the United States; witness
Chicago, M. I. T., Johns Hopkins, and many others. A school that needs a
powerful athletic machine to advertise itself is not worthy of being called a
school; it’s just a play ground. At any rate, R... does not need any greater
lure for students than its academic departments. Being from the South, I
know what a great reputation it has. Why not more subsidies to the various
academic departments? Intra-mural and local athletics such as R... goes in
for seem to provide enough school spirit and athletic activity for all normal
purposes. To subsidize and thus enlarge the athletic department seems to me
to be a needless and undesirable move. R... is, has always been, a SCHOOL
first of all. Enough “glory to old R...” is given by its great teachers. Edu-
cation is, after all, the primary purpose of any true college.’ ”


The leaflets stimulated much discussion among the students. The 
football squad and even the coaches talked vigorously about them. Sev- 
eral students wrote to the college newspaper and demanded that they be 
stopped and that the identity of the distributor be revealed. The fact 
that the experiment was being carried on during the football season un- 
doubtedly added to the heat of the discussions. 

One week after the final leaflet was distributed, ballots identical with
the first were again placed in the student boxes. Not all of these were
returned, and some that were returned had to be discarded to equalize
the groups on the basis of the initial ballots; the groups obviously had to
be equalized in the five categories of response before sound comparisons
could be made. This left 350 subjects from whom usable returns were
tabulated.
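
The division into four groups matched on initial response can be sketched as a stratified assignment. The article does not describe how the matching was actually carried out, so the round-robin procedure, the function name, and the sample data below are all assumptions made for illustration.

```python
from collections import defaultdict


def assign_matched_groups(ballots, n_groups=4):
    """Split (student_id, response) pairs into groups with matching
    proportions of each response category.

    Students are dealt out in turn within each response category, so the
    count for any category differs by at most one between groups.
    """
    by_response = defaultdict(list)
    for student_id, response in ballots:
        by_response[response].append(student_id)

    groups = [[] for _ in range(n_groups)]
    for response, students in by_response.items():
        for position, student_id in enumerate(students):
            groups[position % n_groups].append((student_id, response))
    return groups


# Hypothetical miniature example: 16 ballots in three categories.
sample = ([(i, "Absolutely Yes") for i in range(8)]
          + [(i, "Yes") for i in range(8, 12)]
          + [(i, "No") for i in range(12, 16)])
matched = assign_matched_groups(sample)
for group in matched:
    print([response for _, response in group])
```

When a category does not divide evenly among the groups, some returns must be set aside to make the groups exactly equal, which corresponds to the discarding of returns mentioned above.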


Results 


Our data give a measure of the effects of one, three, and five leaflets.
We had hoped to measure the effects of the second and fourth leaflets
also, but there were not enough subjects to justify a division into two
additional experimental groups. We could have obtained such a measure
by interjecting extra ballots at these points, but this would have intro-
duced a variable which we thought it advisable to avoid.

The first leaflet produced a significant decrease in favorable opinion
toward subsidization. The per cent who were “Absolutely” in favor
dropped from 49 to 16.1. The measure was obtained, of course, five
weeks after this first leaflet was distributed, since the final ballot was
given to all groups at the same time, namely, one week after the fifth
leaflet was distributed. During this same period the control group
dropped from 49 per cent to 42.5 per cent in the “Absolutely Yes" cate-
gory, a loss of only 6.5 per cent. Assuming the groups to be comparable,
and they were comparable in distribution of original opinions at least, it
seems evident that the single leaflet produced a definite positive effect as
far as expression of opinion is concerned.

The various changes are presented in Table 1; the figures represent loss
or gain in votes in relation to position on the original ballot. The great-
est changes were in the “Absolutely Yes" category; thus the one-leaflet
group dropped 32.9 per cent (from 49.0% to 16.1%), the three-leaflet group
37.5 per cent (from 49.0% to 11.5%), and the five-leaflet group 31.8 per
cent (from 49.0% to 17.2%).


                              Table 1

  Loss or Gain in Per Cent of Votes in the Five Categories of Response
Note: The loss or gain (- or +) is with reference to the total vote on the original
ballot.

                   One Leaflet    Three Leaflets    Five Leaflets
Absolutely Yes        -32.9           -37.5             -31.8
Yes                   +12.6           +10.2             +11.4
Undecided             +17.2           +23.9             +17.0
No                    + 1.2               0             + 1.1
Absolutely No         + 2.3           + 3.4             + 2.3


These changes from the “Absolutely Yes” position caused a
necessary increase in certain of the lower categories, with the neutral
position apparently accumulating most of the shifts.

Three leaflets obviously produced a slightly greater effect than one,
but the difference is strikingly small and seems hardly enough to justify
the additional expense and effort involved. Thus, with respect to the
results of the three-leaflet series, 88 per cent of the work appears to have
been done by the first leaflet. What may have been the situation after
the second leaflet we cannot say, but it seems extremely doubtful in view
of the trend that the second pamphlet accomplished anywhere near the
effects of the first.
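
The 88 per cent figure is simply the ratio of the one-leaflet drop in "Absolutely Yes" votes to the three-leaflet drop. A brief sketch of the arithmetic, using the drops reported in Table 1 and the control-group loss reported in the text (variable names are illustrative):

```python
# Drop in "Absolutely Yes" votes, in percentage points, by number of leaflets.
DROP = {1: 32.9, 3: 37.5, 5: 31.8}
CONTROL_DROP = 6.5   # loss in the control group over the same period

# Share of the three-leaflet effect already obtained with one leaflet.
share_of_first = DROP[1] / DROP[3]

# Additional drop attributable to leaflets two and three together.
extra_from_two_more = DROP[3] - DROP[1]

print(f"first leaflet: {share_of_first:.0%} of the three-leaflet drop")
print(f"leaflets two and three added {extra_from_two_more:.1f} points")
```

The ratio is taken on the raw drops, as in the text; subtracting the control-group loss from both drops first would give a somewhat lower share.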
More surprising is the effect of the five-pamphlet series. The end
results are almost exactly the same as for the first leaflet. The law of
diminishing returns seems to be working here with a vengeance! The
slight gain on the third pamphlet is wiped out. Five pamphlets are
no better than three. These results are the more striking when we re-
member that the time interval between the leaflets and the final ballot
was least advantageous to the first and most advantageous to the last
leaflet; thus the superiority of the first leaflet must have been even greater
than the data show.

Of course the content of the last pamphlets may have been responsible
for this decline, although this is doubtful; it is more likely a normal re-
fractoriness to what was taken to be an overdose of propaganda. An-
other possibility is the operation of some unknown extraneous influences,
perhaps newspaper discussions during the intervals; but if such influences
were present, they do not appear in the votes of the control group.

Then again, the factor of timing may have made a difference. One 
week apart may have been too short an interval; it is difficult to believe 
it may have been too long, for refractoriness is known to increase with a 
shortening of the time interval. With intervals of a month possibly the 
effects would have been more cumulative; although with longer intervals 
the element of forgetting naturally becomes greater. 

Another element in timing is to hit an interest when it is most active. 
Our final leaflet and the final ballots were near the close of the football 
season, when student interest in football issues was possibly on the de- 
cline. In this particular college the most crucial game is traditionally 
the last, and the spirit then is at the peak; but there still may have been 
a drop of interest so far as issues like subsidization were concerned. 

The data show another interesting fact, one that has been observed in 
studies of debate audiences, namely, that very few subjects change from 
one side of the neutral point to the other. Thus there were almost no 
additions to the “No” and the “Absolutely No” categories of response.
Twenty-four per cent of the students were opposed to subsidization at 
the beginning, and their number was increased by only 3 or 4 per cent at 
the end; this is practically no change since the control group shows almost 
the same results. Most of the changes were toward or into the neutral 
positions from the “Yes” positions; there, a psychological barrier seemed
to prevent all but a very few from crossing over. This "barrier" is an
interesting problem for further research: what conditions affect its
permeability, and how?


Table 2


Degree of Change Resulting from the Leaflet Series
Note: Figures are the per cents of subjects changed towards the "Absolutely No"
end of the scale.


               One Leaflet   Three Leaflets   Five Leaflets
One Step            35             42               36
Two Steps           10             12               10
Three Steps          1              2                0
Four Steps           0              0                0


Another indication of resistance is the small number of subjects who 
changed more than one step in the scale. The figures are given in Table
2. Figures are for changes in the intended direction. Changes in the
other direction were extremely small, totalling not more than five per 


194 R. W. Dietsch and Herbert Gurnee 


cent. Note that only three per cent of the subjects changed more than 
two steps, and this could be expected from chance alone. Approxi- 
mately ten per cent changed two steps, in most cases from “Absolutely 
Yes" to “Undecided.” 

Here again, it can be seen that three leaflets produced only a slightly 
greater effect than one, and the effect of five leaflets is almost exactly 
that of the first in the series. 
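
The comparison can be made concrete by totalling the columns of Table 2. The short sketch below only re-adds the published per cents; the dictionary layout and the function names are illustrative choices, not part of the original study.

```python
# Per cent of subjects shifting toward the "Absolutely No" end of the
# scale, copied from Table 2 (key = number of scale steps moved).
table2 = {
    "one leaflet":    {1: 35, 2: 10, 3: 1, 4: 0},
    "three leaflets": {1: 42, 2: 12, 3: 2, 4: 0},
    "five leaflets":  {1: 36, 2: 10, 3: 0, 4: 0},
}


def total_changed(steps):
    """Per cent of subjects who moved at least one step."""
    return sum(steps.values())


def changed_more_than(steps, n):
    """Per cent of subjects who moved more than n steps."""
    return sum(pct for size, pct in steps.items() if size > n)


totals = {series: total_changed(steps) for series, steps in table2.items()}
for series, total in totals.items():
    print(f"{series}: {total} per cent changed at least one step")
```

The one-leaflet total reproduces the forty-six per cent quoted in the Summary, and the five-leaflet total is the same figure.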

It should be pointed out that the above effects might have been different
a month or six months after the series ended. Although the amount
of change produced by five leaflets was no greater than that produced by
one leaflet, it is quite possible that this change may have been more
enduring.

Summary 


1. Several hundred college men were presented a series of leaflets 
against subsidization of college athletics. The leaflets were distributed 
one week apart, after an initial ballot of student opinion had been taken. 
A control group and three experimental groups were set up on the basis of 
the initial ballot. A final ballot was taken upon completion of the leaflet 
series.

2. The group receiving one leaflet showed a significant change in the 
intended direction, forty-six per cent of the subjects shifting towards the 
“No” end of the scale. 

3. The group receiving three leaflets indicated a slightly greater
change, but hardly enough to justify the additional effort and expense. 

4. The group receiving five leaflets manifested almost exactly the 
same amount of change as the group receiving but one leaflet. 

Received October 14, 1947. 


A Validating Study of the Work Preference Inventory


George D. Lovell, Hartwell Davis, and Alfred Meacham 
Grinnell College 


Robert W. Henderson of Massillon, Ohio, has published an instru-
ment called the Work Preference Inventory (4) which attracted the at- 
tention of the authors because it proposes to secure an indication of both 
interest and personality traits through the administration of one set of 
test items. Such an instrument, properly validated, would be of great 
use to college counselors in suggesting vocational fitness for occupations 
with known requirements and for indicating areas needing special counsel. 

The 1946 manual for the inventory gives measures of validity ranging 
from a bi-serial correlation of .71 to one of .98. The high and low ratings 
for the bi-serial correlation were determined by contrasting the answers 
given by well adjusted soldiers who had never been referred to a neuro- 
psychiatric clinic with answers given by soldiers who were to be dis- 
charged for neuropsychiatric reasons. In a personal communication to 
one of the authors Mr. Henderson indicated that the high validity coeffi-
cients were due to having a very neurotic group to compare with the
normal. He also suggested that further study of the inventory was 
needed with other groups and was being planned by several organizations. 

This report covers the comparison of the personality scores made by 
college students with ratings of these students by close acquaintances. 
Thus it introduces a different kind of validation from that used by the 
author of the inventory and checks the validity of the existing scale for 
college students instead of neurotic soldiers. An indication of the
inventory's usefulness for college advisement should result from such a
study.

The Work Preference Inventory gives ten personality scores and seven
interest scores, although, according to the author, the testees will not
realize it is anything other than an interest test. The personality traits
measured are reliability, perseverance, emotional stability, creativeness,
conservatism, ambition, masculinity, introversion, anxiety-depression,
and neurotic index. Interest areas listed are persuasive, social service,
theoretical, artistic, mechanical, economic, and scientific.

The test is comprised of pairs of job descriptions such as, “do or would 
you prefer work that is 


196 G. D. Lovell, H. Davis, and A. Meacham


The testee is asked to show his preference for one job or the other.
Thus in the sample above he may indicate his degree of preference as
follows: Prefers inside work strongly; Prefers inside work; Likes both;
Prefers outside work; Prefers outside work strongly.

The test was constructed to be used as an employee selection tool, or
as a guidance and clinical instrument. By disguising the personality
component of the test, it was felt that a more honest evaluation of per-
sonality could be gained.


Procedure 
The inventory was administered to a class of eighty-two college
students, twenty-eight male and fifty-four female, of sophomore and junior
class standing. In administration, precautions were taken to insure that the
testees understood the test directions properly and that they were
uncrowded and unhurried. The test was administered in a serious manner
and was accepted in a like manner by the students.
The criterion of validity was obtained by the use of a graphic rating
scale constructed for the purpose. Its construction was as follows: The
test items that influenced scoring on each trait were listed along with the
author’s definition of each trait in an attempt to make the rating scale
conform as nearly as possible to the inventory. Since the test items were
not in objective terms of any specific type of behavior, it appeared im-
practical to relate the terms of the rating scale to each specific item of the
test. The rating scales were therefore developed on an a priori basis from
the author’s definition of each trait. In most instances one scale was
developed for each trait; however, it was necessary to develop more than
one scale for some of the traits, in order to measure all the components of
the trait as described by the author of the Work Preference Inventory.
Nine of the ten personality traits listed by Henderson were chosen for
study. The tenth, neurotic index, was derived from a particular weight-
ing of heterogeneous items and was thought to be too complex for rating.
The first draft of the graphic rating scale was presented to a class of
fifteen junior and senior college students who had been studying rating
scale construction. This group made many suggestions concerning con-
formity of the proposed scales to the trait definitions and concerning the
wording of the scales so as to achieve apparently equal psychological
spacing of the descriptive guides placed under each rating line. Every
effort was made to make certain that the group fully understood the nature
and purpose of the study, and their suggestions were in essential agree-
ment. The attempt was made to combine the best features of a number
of sample rating forms so that the final scales would conform to proper
principles of rating scale construction (3, 7, 9, 10, 12).


A Study of the Work Preference Inventory 197 


A sample of the actual rating form as shown in Figure 1 will demon-
strate the general nature of the scale. 


1. How does he get along with strangers?

   Descriptive guides, left to right along the rating line: completely
   poised and converses quite freely; nearly always sure of himself and
   can easily engage in conversation; reasonably sure of himself and can
   carry on a fair conversation; somewhat uncomfortable, has little to
   say; nothing to say, is ill at ease.


2. How does he react to social gatherings?

   Descriptive guides, left to right along the rating line: retiring,
   very self-conscious and ill at ease; limits contacts to one or two
   persons; mixes as well as the average; [partly illegible] the livelier
   members of the group; always self-assured, the "live wire" of any group.


Fig. 1. Sample of rating form used in obtaining the criterion.


Each scale, as can be seen, was written in terms of a particular type of 
objective behavior. The nature of the behavior under consideration in 
each case was keynoted by an initial question and then descriptive guides 
were put in terms of the same type of behavior. The descriptive guides 
were subordinated to the keynoting question by printing them in smaller 
type. The continuity of the rating line was emphasized by making the 
horizontal line considerably heavier than the vertical division marks. 
Thorough reading of the descriptive guides was encouraged by alternating 
the high-low direction of the rating line on successive items. Ample 
space was allowed between the items in order to insure positive separation
of ideas, and a space for comments was provided at the end of the report. 

The rating report was considered to be a valid rating of the traits 
under consideration for the following reasons: 


1. The best available techniques of graphic rating scale construction 
were employed. 

2. In the majority of cases there was agreement among the four raters
of each individual. In less than 20% of the cases were there discrep-
ancies of more than four scale points (out of ten) among the raters for
each subject.

3. The individual scales of the rating report were checked by visual
inspection of the frequency curves. In some instances, the curves were
slightly skewed but in each case the skew was in the direction to be ex-
pected from the group used as subjects. For example, the resulting
Emotional Stability curve indicated that the group tended to be more
emotionally stable than the typical population.

Two weeks after the inventory had been administered, the 82 subjects
were told the nature of the study being made and each was asked to list
five of his closest campus friends. The names were listed on a mimeo-
graphed form in order of intimacy, together with information concerning
length and nature of association. Four of the five friends listed by each 
subject were asked to rate him. Assistants were present during the 
scheduled rating hours to answer questions and check over the instruc- 
tions with each rater. In all contacts with the raters, emphasis was 
placed on the fact that the rating data would be treated in strict confi- 
dence. Excellent cooperation was demonstrated by all students involved. 
The serious attitude of the raters was demonstrated by the fact that a 
high percentage of the students supplemented the rating report by adding 
comments in the space provided. 

The general method of scoring and averaging the rating scales and 
the method of correlation used are as follows: 

In most cases one scale was developed to measure each trait; however, 
in some instances a single scale was felt to be inadequate and two or more 
scales were used to measure the component parts of those traits. When 
more than one scale was used, the scales were weighted in such a way as
to place each trait score on an equal basis. 

The four rating scales for each subject were then averaged by traits 
and recorded along with the test scores for the corresponding traits. A
frequency distribution for each trait was compiled from the averaged
rating scale scores and a mean and median were determined for each
distribution. The rating scale scores were then labeled with a plus or
minus depending upon comparison with the central tendency. A bi-serial 
correlation was then used to compare the test scores with the plus and 
minus rating scale scores. 
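
The procedure just described can be sketched in code. The paper does not print the formula it used, so the block below assumes the standard textbook bi-serial coefficient, r = (M_plus - M_minus) / sigma * (pq / y), where p and q are the proportions of plus and minus ratings and y is the normal ordinate at the point of division. The scores and ratings are hypothetical illustration values, and the split at the median is one reading of "comparison with the central tendency".

```python
from statistics import NormalDist, mean, median, pstdev


def biserial_r(scores, is_plus):
    """Textbook bi-serial r between continuous scores and a plus/minus split."""
    plus = [s for s, flag in zip(scores, is_plus) if flag]
    minus = [s for s, flag in zip(scores, is_plus) if not flag]
    p = len(plus) / len(scores)          # proportion labeled plus
    q = 1.0 - p                          # proportion labeled minus
    z_cut = NormalDist().inv_cdf(q)      # normal deviate at the point of division
    y = NormalDist().pdf(z_cut)          # ordinate of the normal curve there
    return (mean(plus) - mean(minus)) / pstdev(scores) * (p * q / y)


# Hypothetical data: inventory trait scores and averaged ratings for 8 subjects.
test_scores = [12, 15, 9, 20, 14, 18, 7, 16]
avg_ratings = [5.0, 6.5, 6.0, 8.0, 5.5, 7.0, 3.5, 4.0]

cut = median(avg_ratings)                    # assumed central tendency
labels = [r > cut for r in avg_ratings]      # plus = above the median rating
r_bis = biserial_r(test_scores, labels)
print(f"bi-serial r = {r_bis:.2f}")
```

Reversing the plus and minus labels changes only the sign of the coefficient, which is a quick check that the split has been coded in the intended direction.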


Results 
The results shown as bi-serial r’s between the test and averaged ratings 
for each trait are presented in Table 1. 


Table 1


Bi-serial correlations between ratings and test scores for 9 traits.
N = 82 college students


Trait                   Bi-serial r
Reliability              .08 ± .09
Perseverance             .26 ± .09
Emotional Stability     -.14 ± .09
Creativeness             .14 ± .09
Conservatism            -.07 ± .08
Ambition                 .19 ± .09
Masculinity              .39 ± .08
Introversion             .31 ± .09
Anxiety-Depression       .25 ± .09


Conclusions

1. On the basis of the a priori rating report devised for this study, it
would appear that none of the nine measures of the Work Preference
Inventory, as listed in Table 1, is valid as a measure of personality traits
of a normal college population.

2. Until further work is done to improve the test, its usefulness as a
clinical tool in college counseling is doubtful.

3. Since the subjects used in this study were presumably representa-
tive of a normal college population, these results would not discount the
value of the test as a clinical aid with a more deviant population. 
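
The ± figures in Table 1 are not labeled in the source. As a rough check, the sketch below assumes they are probable errors of the bi-serial coefficient, computed with the usual large-sample formula P.E. = 0.6745 (sqrt(pq)/y - r^2) / sqrt(N), and assumes an even split of the ratings (p = q = .5). Both assumptions are the editor's, not the authors'. Under them the formula reproduces every reported value except the one for Conservatism, which it gives as .09.

```python
import math
from statistics import NormalDist

N = 82  # number of college students, from Table 1

# Trait: (bi-serial r, reported +/- value), copied from Table 1.
table1 = {
    "Reliability": (0.08, 0.09),
    "Perseverance": (0.26, 0.09),
    "Emotional Stability": (-0.14, 0.09),
    "Creativeness": (0.14, 0.09),
    "Conservatism": (-0.07, 0.08),
    "Ambition": (0.19, 0.09),
    "Masculinity": (0.39, 0.08),
    "Introversion": (0.31, 0.09),
    "Anxiety-Depression": (0.25, 0.09),
}


def probable_error_biserial(r, n, p=0.5):
    """Assumed large-sample probable error of a bi-serial r."""
    q = 1.0 - p
    y = NormalDist().pdf(NormalDist().inv_cdf(q))   # ordinate at the split
    standard_error = (math.sqrt(p * q) / y - r * r) / math.sqrt(n)
    return 0.6745 * standard_error


for trait, (r, reported) in table1.items():
    pe = probable_error_biserial(r, N)
    print(f"{trait:20s} r = {r:+.2f}  computed P.E. = {pe:.3f}  "
          f"reported = {reported:.2f}  r / P.E. = {r / pe:+.1f}")
```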


Received September 23, 1947. 


References


1. Clark, W. A., and Smith, L. J. Further evidence on the validity of personality
   inventories. J. educ. Psychol., 1942, 33, 81-91.
2. Darley, J. G., and McNamara, W. J. Factor analysis in establishing new person-
   ality tests. J. educ. Psychol., 1940, 31, 321-334.
3. Guilford, J. P. Psychometric methods. New York: McGraw-Hill, 1936.
4. Henderson, Robert W. Work Preference Inventory and Manual of instructions for
   the Work Preference Inventory. Massillon, Ohio: Robert W. Henderson, 1946.
5. Lynch, J. The psychology of the rating scale. Educ. Adm. Supervis., 1944, 30,
   497-501.
6. Robertson, A., and Stromberg, E. L. Agreement between associates' ratings and
   self ratings of personality. Sch. & Soc., 1939, 50, 126-127.
7. Stevens, S. N., and Wonderlic, E. F. An effective revision of the rating technique.
   Person. J., 1934, 13, 125-134.
8. Strong, E. K., Jr. Weighted vs. unit scales. J. educ. Psychol., 1945, 36, 193-215.
9. Symonds, P. M. Diagnosing personality and conduct. New York: Appleton-
   Century, 1936.
10. Tiffin, J., and Musser, W. Weighting merit rating items. J. appl. Psychol., 1942,
    26, 575-583.
11. Tiffin, J. Industrial psychology. New York: Prentice Hall, 1942.
12. Weinland, J. D. Better words on rating scales. Person. J., 1946, 25, 181-134.


Influence of College Science Courses on the Development of
Attitude Toward Evolution


Key L. Barkley 
Woman's College of University of North Carolina 


Some doubt has been raised as to whether the curriculum studied
in college makes much difference with respect to changes in students'
attitudes toward such things as law, the church, the constitution,
and God. These attitudes are more or less general in nature and perhaps
not very specifically related to any introductory college course or curricu-
lum. It would appear, however, that attitude toward evolution would
be more specifically related to studies of science, and subject to change
by reason of advance in such courses.


Plan of the Experiment 


Purpose. The general purpose of the present investigation was to
bring out any discoverable curriculum influences on development of stu-
dents' attitude toward evolution. The specific purposes were: (1) to
find out whether study of science and mathematics in high school had any
relation to students’ attitude toward evolution at the time they entered
college; (2) to discover the changes in attitude toward evolution made by
college freshmen in a regular college course which included two semesters
of biology, mathematics, or chemistry, or a combination of two semesters
each of biology and chemistry; (3) to compare the changes in freshmen’s
attitude toward evolution with those made by students in a one year com-
mercial course which had no science studies in it; (4) to discover the
changes in attitude toward evolution made by upper class students who
took a freshman science course, or an advanced course in anatomy.

Subjects. The freshman science groups were composed of those who
were studying a course in introductory biology, chemistry, or mathe-
matics, or courses in both biology and chemistry at the same time, but
who otherwise took the same general courses. These were all separate
and distinct groups with each student appearing in only one group. The
upperclassmen groups were composed of the students above the freshman
level who were in the freshman science courses, or who were taking an
advanced course in anatomy. (The anatomy course lasted just one
semester.) The commercial students were admitted to the college on
the basis of the same general high school credits as required of the


College Science Courses and Evolution 201


freshmen, but the commercial students took a course which was strictly a
business college curriculum not including any science or mathematics.
A special group of freshmen who took introductory chemistry both semes-
ters was also tested.

Materials. The measuring instrument used was the Attitude Toward 
Evolution Scale made by T. G. Thurstone under the editorship of L. L. 
Thurstone, Scale 30, Forms A and B.

Procedure. The method of test-retest was used. Form A of the scale 
was given to the commercial students and to all students in the biology 
and anatomy groups in the fall. All the other groups were given forms 
A and B in approximately equal numbers in the fall. Each student was 
then retested at the end of the course in the spring with the form which 
he had not marked previously. Some of the biology students were re- 
tested at the end of the first semester with Form B. These students had 
to be given Form A at the end of the year. 


Results 
The distribution of time given to mathematics and science in high
school by the various groups of subjects is shown in Table 1. The cases
of reliable differences in per cents of the groups which took the different
science courses in high school are presented in Table 2.


Table 1
Showing the Per Cent of Each Group of Subjects who had Mathematics and the
Various Sciences in High School


                  College Students Grouped According to Subjects Taken

                                                           Biology
                                      Chem-     Mathe-       and       Commer-
H.S. Course               Biology     istry     matics    Chemistry     cial
General Science—1 yr.      42.3       45.9       44.6       60.7        71.8
Biology—1 yr.              82.2       80.7       78.2       83.3        81.7
        2 yrs.              2.7                  10.9        3.6
Chemistry—1 yr.            34.2       52.3       34.6       59.5        14.1
Physics—1 yr.              15.1       13.8        4.0       19.0        22.5
Mathematics—1 yr.                      1.8        1.0                    1.4
            2 yrs.          4.1       11.9       10.9        9.5        12.7
            3 yrs.         79.5       59.6       33.7       56.0        70.4
            4 yrs.         16.4       26.6       54.5       34.5        15.5


By the absence
of the comparisons in Table 2, it will be noted that there were no signifi-
cant differences between the freshman groups who elected college biology,
chemistry, or mathematics in per cents of the groups who had studied any
of the sciences in high school. Moreover, there were no significant


202 Key L. Barkley


differences between any of the groups in per cents of them which had studied
biology in high school. (Note absence of comparisons between the groups
with respect to study of biology in high school.) It was found, however,
that a reliably greater per cent of the freshman group which took both
biology and chemistry in college than of any other group, except the one
which elected chemistry in college, had studied chemistry in high school,


Table 2
Showing the Cases of Reliable Differences between Groups in Per Cents which
Studied the Various Sciences in High School, and the Reliability Indices
of those Differences * (Taken from Table 1)


                College Students Grouped According to Subjects Taken

                                                                     Biology and
                       Biology       Chemistry      Mathematics       Chemistry
                      D    D/σD      D    D/σD      D    D/σD        D    D/σD
Biology and Chem.
Group
  H.S. Gen. Sci.    .184   2.33    .148   2.10    .161   2.21
  H.S. Chemistry    .253   3.61                   .249   3.51
  H.S. Physics                                    .150   3.20
Commercial Group
  H.S. Gen. Sci.    .295   3.78    .259   3.60    .272   3.78
  H.S. Chemistry   -.201   3.30   -.382   6.06   -.205   3.25      -.454  [illegible]
  H.S. Physics                                    .185   3.49


* [Footnote text illegible.]

and that a reliably smaller per cent of the commercial group studied chem-
istry in high school than of any of the freshman groups. It is shown
that a reliably larger per cent of the commercial group studied general
science in high school than of any other group, except the one which took
both biology and chemistry in college.
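
The differences in Table 2 are simple differences between the per cents of Table 1, and the reliability indices can be approximated from the group sizes given in Table 3. The paper does not state which formula it used for the standard error of a difference between per cents; the sketch below assumes the common unpooled form sqrt(p1 q1 / n1 + p2 q2 / n2), and it assumes the Commercial group numbered 71, a value that is poorly legible in the source. It comes close to, but does not exactly reproduce, the published indices.

```python
import math

# Group sizes from Table 3 (the Commercial N of 71 is an assumed reading).
N = {"Biology": 73, "Chemistry": 109, "Mathematics": 101,
     "Biology and Chemistry": 84, "Commercial": 71}

# Per cent of each group with one year of the subject in high school (Table 1).
general_science = {"Biology": 42.3, "Chemistry": 45.9, "Mathematics": 44.6,
                   "Biology and Chemistry": 60.7, "Commercial": 71.8}
chemistry = {"Biology": 34.2, "Chemistry": 52.3, "Mathematics": 34.6,
             "Biology and Chemistry": 59.5, "Commercial": 14.1}


def diff_and_ratio(per_cents, group_a, group_b):
    """Difference in proportions (a minus b) and its assumed critical ratio."""
    p1 = per_cents[group_a] / 100.0
    p2 = per_cents[group_b] / 100.0
    d = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / N[group_a] + p2 * (1 - p2) / N[group_b])
    return round(d, 3), d / se


d, ratio = diff_and_ratio(general_science, "Commercial", "Biology")
print(f"General science, Commercial vs. Biology: D = {d}, D/sigma = {ratio:.2f}")

d, ratio = diff_and_ratio(chemistry, "Commercial", "Chemistry")
print(f"Chemistry, Commercial vs. Chemistry: D = {d}, D/sigma = {ratio:.2f}")
```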

Since, as it will be shown later, all but one of the freshman groups had
mean scores showing a more favorable attitude toward evolution than
that held by the commercial group, it would appear that election of gen-
eral science in high school to the neglect of the more specific science
courses is associated with a less favorable attitude toward evolution when
the students get to college. This finding is not unequivocal, however,
since one freshman group was not reliably more favorable in attitude to-
ward evolution than the commercial group.

The fact that the commercial students showed a less favorable atti-
tude toward evolution than the freshman groups should not be interpreted
as being due to what the students learned in general science. Forty to
sixty per cent of all the freshman groups studied general science in high
school. Moreover, the freshman group which elected both biology and 
chemistry in college gave so much time to general science in high school 
that it was not reliably different from the commercial group on this score. 
It also will be noted in Table 2 that the biology and chemistry freshman 
group emphasized the study of general science in high school more than 
any other freshman group; the critical ratios of the differences were all
2 plus. Even with this extra emphasis on general science, the biology and
chemistry freshman group had the most favorable attitude toward evolu-
tion shown by any group. It appears, then, that the study of general
science is not in itself a hindrance to the development of a favorable atti-
tude toward evolution, but that a more favorable attitude tends to be
developed if the study of general science is supplemented by adequate 
emphasis on the study of more specific science courses. 

There is some indication that attitude toward evolution is associated
with the amount of time spent studying mathematics and science in high
school. At the first testing, the commercial group had the least favorable
attitude toward evolution and the group which elected to study both 
biology and chemistry in college had the most favorable attitude. The 
commercial group spent the least time on mathematics and science in 
high school and the freshman group which elected both biology and chem- 
istry in college spent the most time on these subjects. The following 
tabulation showing the average number of years spent on mathematics 
and science in high school by all the groups will make this difference 
plain: Commercial, 4.9; Biology, 4.92; Chemistry, 5.04; Mathematics, 
5.25; Chemistry and Biology, 5.55. 
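
These averages can be recovered from Table 1 by weighting each per cent by the number of years it represents. The sketch below does so; the placement of values in the rows with blank cells (Biology, 2 yrs., and Mathematics, 1 yr.) is an editorial reading of a damaged table, and the names used are illustrative. Under that reading the computed averages agree with the tabulation to within rounding.

```python
GROUPS = ["Biology", "Chemistry", "Mathematics",
          "Biology and Chemistry", "Commercial"]

# (years of study, per cent of each group in GROUPS order), from Table 1.
# Blank cells are entered as 0.0; their placement is an assumed reading.
ROWS = [
    (1, [42.3, 45.9, 44.6, 60.7, 71.8]),   # General Science, 1 yr.
    (1, [82.2, 80.7, 78.2, 83.3, 81.7]),   # Biology, 1 yr.
    (2, [2.7, 0.0, 10.9, 3.6, 0.0]),       # Biology, 2 yrs.
    (1, [34.2, 52.3, 34.6, 59.5, 14.1]),   # Chemistry, 1 yr.
    (1, [15.1, 13.8, 4.0, 19.0, 22.5]),    # Physics, 1 yr.
    (1, [0.0, 1.8, 1.0, 0.0, 1.4]),        # Mathematics, 1 yr.
    (2, [4.1, 11.9, 10.9, 9.5, 12.7]),     # Mathematics, 2 yrs.
    (3, [79.5, 59.6, 33.7, 56.0, 70.4]),   # Mathematics, 3 yrs.
    (4, [16.4, 26.6, 54.5, 34.5, 15.5]),   # Mathematics, 4 yrs.
]

# Averages quoted in the text, for comparison.
REPORTED = {"Commercial": 4.9, "Biology": 4.92, "Chemistry": 5.04,
            "Mathematics": 5.25, "Biology and Chemistry": 5.55}


def mean_years(column):
    """Average years of high-school mathematics and science for one group."""
    return sum(years * per_cents[column] for years, per_cents in ROWS) / 100.0


for column, group in enumerate(GROUPS):
    print(f"{group:22s} computed {mean_years(column):.3f}  "
          f"reported {REPORTED[group]}")
```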

The more favorable attitude toward evolution shown by the freshmen
as compared with the commercial students probably is associated with a
greater interest in science on the part of the freshmen. Evidence of this
probability is found in the greater amount of time given by the freshmen
to science and mathematics in high school, their election of a liberal arts
course instead of a commercial course, and in the highly favorable score of
the biology and chemistry group which was composed of majors in those
two fields.

The mean scores of the several groups of subjects at the time of both
testings are shown in Table 3. These scores show that in the fall the 
freshman groups who chose biology, or mathematics, or both biology and 
chemistry for their first year science courses were neutral in attitude, 
according to a scale furnished by the makers of the test. Those who 
elected chemistry only were slightly prejudiced against evolution. The 
commercial students also showed mild prejudice against evolution. The
upperclassmen who took freshman science courses were neutral in atti-
tude, but those who took the anatomy course were more advanced in
science and were believers in evolution. In the spring, the group studying 
both biology and chemistry and the upperclassmen in a freshman science 
had stepped up to a position of belief in evolution. The formerly pre- 
judiced chemistry group moved up to a neutral position. The other 
groups remained in the same general positions they had held at the first 
testing, even though some changes in attitude toward evolution had been 
made. 
Table 3


Showing Mean Fall and Spring Scores of All Groups of Students, Differences between
the Mean Scores, and the Reliability Indices of the Differences *


                                   Fall          Spring
Group                     N     Mean Score     Mean Score     Diff.    D/σdiff
Biology                   73       5.35           5.81         .46      3.01
Chemistry                109       4.95           5.46         .51      4.47
Mathematics              101       5.33           5.63         .30      2.48
Biology and Chemistry     84       5.62           6.41         .79      5.81
Commercial                71       4.65           4.41        -.24      1.70
Upper class students      33       5.82           6.01         .19      1.82
Chemistry repeats         26       5.18           5.58         .45      2.00
Anatomy                   60       6.47           6.42        -.05       .40


* The formula for correlated measures was used in calculating the critical ratios in 
this table. 
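
The paper does not print the formula for correlated measures. A minimal sketch of the usual form is given below: the standard error of the difference is sqrt(se1^2 + se2^2 - 2 r se1 se2), where se1 and se2 are the standard errors of the two means and r is the fall-spring correlation. The means and N are those of the Biology group in Table 3 and r is the correlation reported for that group (read here as .49); the standard deviation of 1.3 is purely hypothetical, since no standard deviations are reported.

```python
import math


def critical_ratio(mean_1, sd_1, mean_2, sd_2, n, r=0.0):
    """Critical ratio of a change in means; r = 0 gives the uncorrelated form."""
    se_1 = sd_1 / math.sqrt(n)      # standard error of the first mean
    se_2 = sd_2 / math.sqrt(n)      # standard error of the second mean
    se_diff = math.sqrt(se_1 ** 2 + se_2 ** 2 - 2.0 * r * se_1 * se_2)
    return (mean_2 - mean_1) / se_diff


ASSUMED_SD = 1.3   # hypothetical; the paper reports no standard deviations

correlated = critical_ratio(5.35, ASSUMED_SD, 5.81, ASSUMED_SD, n=73, r=0.49)
uncorrelated = critical_ratio(5.35, ASSUMED_SD, 5.81, ASSUMED_SD, n=73, r=0.0)
print(f"with r = .49: {correlated:.2f}; ignoring the correlation: {uncorrelated:.2f}")
```

Because fall and spring scores come from the same students and are positively correlated, the correlated form yields a smaller standard error and hence a larger critical ratio than the uncorrelated form applied to the same figures.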


Table 3 also shows the differences between the mean scores of the 
various groups of subjects at the first and second testings. It was found 
that the freshman groups which studied biology, or chemistry, or both 
biology and chemistry at the same time made statistically significant 
changes in attitude during the year, but those who studied mathematics 
achieved a change with a critical ratio of only 2.48. Commercial stu- 
dents, upper classmen in a freshman science, chemistry repeaters, and 
upper classmen in an anatomy class did not make reliable changes 
in mean scores.
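
The classification in this paragraph is consistent with treating a critical ratio of 3.00 or more as the mark of a reliable change. The paper does not state its cutoff, so the threshold in the sketch below is an assumption; the differences and ratios are copied from Table 3.

```python
# Group: (spring minus fall difference, critical ratio), from Table 3.
table3 = {
    "Biology": (0.46, 3.01),
    "Chemistry": (0.51, 4.47),
    "Mathematics": (0.30, 2.48),
    "Biology and Chemistry": (0.79, 5.81),
    "Commercial": (-0.24, 1.70),
    "Upper class students": (0.19, 1.82),
    "Chemistry repeats": (0.45, 2.00),
    "Anatomy": (-0.05, 0.40),
}

ASSUMED_THRESHOLD = 3.0   # assumed cutoff; not stated in the paper


def reliable_groups(rows, threshold=ASSUMED_THRESHOLD):
    """Groups whose change has a critical ratio at or above the threshold."""
    return [group for group, (_, ratio) in rows.items() if ratio >= threshold]


print(reliable_groups(table3))
```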

1 As indicated in Table 3, the correlations between the fall and spring scores of the
various groups were worked out and used in determining the critical ratios of the changes
made in attitude. The following correlations between fall and spring scores were found:
Biology group, .49; Chemistry group, .57; Mathematics group, .42; Biology and Chem-
istry group, .47; Commercial group, .55; upper class students in a freshman science, .81;
chemistry repeats, .49; Anatomy group, .60.

Table 4 shows the reliability of the differences between the degrees
of change in mean scores made by the different groups. There was no
statistically significant difference in degree of change made by any of the
freshman groups as compared with any other. Only the freshmen who
studied both biology and chemistry made a change which was by the most
exacting criterion reliably greater in degree than that made by the com-
mercial students (critical ratio of 3.75). However, the critical ratios of
the differences in change between the commercial and the other freshman
groups were in every case above 2.00, and these ratios indicate a high prob-
ability of significant differences. Moreover, all the freshman groups
changed toward a more favorable attitude in sufficient degree to produce
a reliable difference between their mean scores and the mean score of the
commercial group at the second testing.


Table 4


Showing the Reliability Indices of the Differences Between the Degrees of Change
Noted in the Various Groups of Students Following a Year's Study in College *


                                                                   Biology and
                   Biology        Chemistry       Mathematics       Chemistry
                  D    D/σD       D    D/σD       D    D/σD        D    D/σD
Biology                          .05    .18                       .33  [illeg.]
Chemistry                                                         .28   1.10
Mathematics      .16  [illeg.]   .21  [illeg.]                    .49   2.00
Commercial       .70   2.41      .75   2.82      .54   2.10      1.03   3.75


* The groups named at the top showed the greater change as compared with the
groups named at the left.
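
Each D in Table 4 is the change shown by the group named at the top less the change shown by the group named at the left, both taken from the difference column of Table 3. The sketch below recomputes a few of them; the function name is illustrative.

```python
# Spring minus fall change in mean score for each group, from Table 3.
change = {
    "Biology": 0.46,
    "Chemistry": 0.51,
    "Mathematics": 0.30,
    "Biology and Chemistry": 0.79,
    "Commercial": -0.24,
}


def difference_in_change(top, left):
    """Change of the group named at the top minus that of the group at the left."""
    return round(change[top] - change[left], 2)


for top in ["Biology", "Chemistry", "Mathematics", "Biology and Chemistry"]:
    print(f"{top} vs. Commercial: D = {difference_in_change(top, 'Commercial'):.2f}")
```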

Table 5 shows the differences between the mean scores of the several 
groups of subjects at the time of both testings. It is noted that the fresh- 
man groups were not reliably different from each other in the fall, except 
that the group which studied both biology and chemistry was reliably 
more favorable in attitude toward evolution than the group which elected 
chemistry alone. (It should be pointed out, however, that both the 
biology and the mathematics groups had higher mean scores than the 
chemistry group, and that the critical ratios of the differences were above 
2.0.) All the freshman groups, except the one which studied chemistry
only, were reliably more favorable in attitude toward evolution than the 
commercial group. At the spring testing, the group of freshmen who had 
had both biology and chemistry was reliably more favorable in attitude 
toward evolution than any of the other freshman groups. The freshman 
groups who studied biology, chemistry, or mathematics were not signifi- 
cantly different from each other in mean scores. All freshman groups 
were reliably more favorable in attitude toward evolution than the com- 
mercial group. 
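
The D values of Table 5 follow directly from the mean scores of Table 3: each is the mean of the group named at the top less the mean of the group named at the left, taken separately for fall and spring. A brief sketch, with illustrative names:

```python
# Mean scores from Table 3.
fall = {"Biology": 5.35, "Chemistry": 4.95, "Mathematics": 5.33,
        "Biology and Chemistry": 5.62, "Commercial": 4.65}
spring = {"Biology": 5.81, "Chemistry": 5.46, "Mathematics": 5.63,
          "Biology and Chemistry": 6.41, "Commercial": 4.41}


def mean_difference(means, top, left):
    """Mean of the group named at the top minus the mean of the group at the left."""
    return round(means[top] - means[left], 2)


for season, means in (("Fall", fall), ("Spring", spring)):
    d = mean_difference(means, "Biology and Chemistry", "Commercial")
    print(f"{season}: Biology and Chemistry vs. Commercial, D = {d:.2f}")
```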




Table 5


Showing the Differences Between Mean Scores of the Various Groups of Students in
the Fall and in the Spring,* and the Reliability Indices
of these Differences **


                                                                          Biology and
                          Biology        Chemistry       Mathematics       Chemistry
                         D    D/σD       D    D/σD       D    D/σD        D    D/σD
Biology       Fall                                                       .27   1.36
              Spring                                                     .60   3.03
Chemistry     Fall      .40   2.12                      .38   2.21       .67   3.58
              Spring    .35   1.79                      .17   1.06       .95   5.49
Mathematics   Fall      .02  [illeg.]                                    .29   1.59
              Spring    .18    .96                                       .78   4.78
Commercial    Fall      .70   3.85      .30   1.78      .68   4.15       .97   5.39
              Spring   1.40   6.14     1.05   5.10     1.22   6.16      2.00   9.60


* Groups named at the top made the higher scores as compared with the ones named
at the left.

** The formula for uncorrelated measures was used in calculating the critical ratios
in this table.


These results appear to indicate that the successful study of any gen-
eral science or mathematics course by freshmen tends to promote change
to a more favorable attitude toward evolution. There is a slight sugges-
tion that the laboratory sciences may be more effective than mathematics,
possibly because of containing more material directly related to evolution
and a greater emphasis upon scientific approach in doing the actual
laboratory exercises.

It should be noted that the group who took both biology and chemistry
at the same time in college had the highest average score found in any
group at both testings. This group was composed of students who
planned to major in either biology or chemistry; they had had more science
and mathematics in high school than the others; and they made more
progress in the study of science in general during the first year in college.
It would appear, therefore, that favorable attitude toward evolution is
associated with a special interest in science and with progress in the study
of science.




Summary and Conclusions 


College freshmen taking a course in introductory biology, chemistry, 
mathematics, or both biology and chemistry were given the Thurstone
Scale on Attitude Toward Evolution at the beginning and at the end of 
these full year courses. A group of commercial students, who had a year 
of work in the same college without any science courses being included in 
their curriculum, was tested in the same manner. 

Among the freshmen only those who took both biology and chemistry 
were significantly different from any of the others in attitude toward 
evolution when first tested; those taking both science courses were more 
favorable in attitude than those who elected chemistry only. Regular 
college freshmen, however, were significantly more favorable in attitude 
toward evolution than the commercial students, except in the case of 
those who elected chemistry as their freshman science course. 

During the year, all freshmen students of science changed to a more 
favorable attitude toward evolution, while the change noted in the com- 
mercial students was insignificant. Those groups taking biology or 
chemistry alone and the one taking both biology and chemistry made 
statistically reliable changes. The difference in degree of change be-
tween freshmen students studying the various sciences was not statis-
tically reliable. The difference in degree of change between the com-
mercial group and the students of science had a critical ratio of 2 plus in
all cases and was definitely reliable between the commercial group and
the freshman group which took both biology and chemistry (critical ratio
of 3.75).

At the second testing, all groups of students who had studied science
were reliably more favorable in attitude toward evolution than the com- 
mercial students. Moreover, those students who had studied both bi- 
ology and chemistry were reliably more favorable in attitude toward evo- 
lution than were those other freshmen who had studied only biology, or 
chemistry, or mathematics. 

Several conclusions may be drawn from the findings: 


1. The characteristic attitude of the regular college freshman, at the 
college where the study was made, tends to be one of neutrality or doubt 
respecting evolution. Those who elect the one year commercial course 
tend to be prejudiced against evolution. 

2. Study of courses in science and mathematics tends to promote the 
development of a more favorable attitude toward evolution. The study 
of biology or chemistry alone or a combination of two sciences (biology 
and chemistry) appeared to have sufficient influence to facilitate a change 
in attitude which was shown to be significant by the most rigorous statis-
tical criterion. Moreover, none of the various science courses considered
in this investigation was significantly more effective than the others in
promoting changes in attitude toward evolution.

3. Upperclassmen in the groups studied here tend to be favorable
in attitude toward evolution, but they showed very little change in [...]
anatomy. Their attitude appears to depend upon prior studies and other
influences.

4. Study in a curriculum which did not contain any science courses
tended to leave the student with his attitude toward evolution unchanged.

5. Special interests in science, as indicated by election of more [...] as-
sociated with a favorable attitude toward evolution and is also accom-
panied by a reliably greater development toward a still more favorable
attitude than is noted in the case of those who are non-majors and [...]
only one course in science as freshmen, or who study in a commercial
curriculum not containing science courses. Likewise, a more favorable
attitude toward evolution is associated with more study of and progress
in science as indicated by a devotion of a greater amount of time to mathe-
matics and science in high school.

6. Probably these findings cannot be too liberally generalized, but are
indicative of the conditions and developments in the college where the
study was made.

Received June 24, 1947. 
References

1. Barkley, Key L. Relative influence of commercial and liberal arts curricula upon
changes in students’ attitudes. J. soc. Psychol., 1942, 15, 129-144.

2. [...] B. Attitudes of undergraduate students. J. soc. Psychol., [...]

3. Corey, S. M. Changes in the opinions of female students after a year at the univer-
sity. J. soc. Psychol., 1940, 11, 341-351.

4. Dudycha, G. J. The belief of college students concerning evolution. J. appl.
Psychol., 1934, 18, 85-96.

5. [...] attitude of college students toward war. J. soc. Psychol., 1942, [...]

6. Katz, D., and Allport, F. H. Students’ attitudes. Syracuse: Craftsman Press, [...]

7. McCreery, Otis Claire. The influence of college residence on attitude. Summaries
of Ph.D. Theses, University of Minnesota, Vol. II, 1939, pp. 123-127.

8. [...] T. Student attitudes toward war. Sociol. and Soc. Res., 1936, [...]

9. [...] A. C. A quantitative study of social attitudes. School Rev., 1935, [...]

10. Smith, M. Spontaneous change of attitude toward communism. School and Soc.,
1940, 51, 684-688.

11. Sowards, G. S. A study of the war attitudes of college students. J. abn. soc.
Psychol., 1934, 29, 328-333.

12. Thurstone, L. L., and Chave, E. J. The measurement of attitude. Chicago: Univer-
sity of Chicago Press, 1929.

13. Thurstone, T. G. Attitude toward evolution, Scale 30. Chicago: University of
Chicago Press, 1931.


Book Reviews 


Lytle, Charles Walter. Job evaluation methods. New York: The Ronald
Press Company, 1946. Pp. 329. $6.00.

Job evaluation methods is essentially a description, with illustrations, 
of the steps involved in the development of job evaluation plans, and of
the various ways in which these steps may be accomplished. 

The description of methodology is preceded by a review of the factors 
that have focused more serious attention on job evaluation in the past 
several years, a summary of the purposes of job evaluation, a discussion 
of the integration of job evaluation with other management functions, 
especially job control, a reminder of the importance of clearly defined 
personnel policies as they relate to job evaluation, and a review of per- 
tinent organizational and administrative considerations. 

The author makes no pretense of introducing any new basic methods 
or techniques; rather, he has sought to present in a single package various 
fundamental and detailed prevailing practices. The book is a distinctly 
significant contribution toward this objective not only because of the 
comprehensive and careful treatment of methodology as such, but also 
because of the generally adequate analysis of various practices, the in- 
clusion of words of warning on various aspects, and the prevailing em- 
phasis on practical considerations. Illustrative of the superior treatment 
of the subject matter, for example, are the discussions of the influence on 
the character of the trend line of the use of arithmetic versus geometric 
progression in the assignment of degree allotments, and the attention
devoted to the building of the rate structure. 

It is to be wished, however, that the thoroughness that is generally
characteristic of the book had been extended to include appropriate treat-
ment of certain of the more theoretical underpinnings of job evaluation which
have significant practical implications. In connection with the selection
of job factors, for example, considerable attention is devoted to a review
of various plans and the crystallization of a standard pattern of job factors;
there is, however, no reference to the possible use of statistical techniques
such as factor analysis for revealing inter-relationships among job factors
and as an aid in identifying the most distinctly independent factors for
inclusion in job evaluation plans. Similarly, while there is a presentation
of both the extreme and the more typical weightings of various factors,
there is no suggestion of the possible statistical determination of weight-
ings which most nearly reflect the relative intrinsic economic worth of
the various factors as determined by the dynamics of economic and social
forces. In addition, the treatment of the statistical reliability of job
ratings seems much too superficial in the light of its importance.

Aside from these blind spots, however, and an occasional more trivial
transgression, the book can be considered as a distinctly effective venti-
lation system for the hazy atmosphere that characterizes some current
thinking on job evaluation, and can be recommended for use as a uniquely
adequate text or manual on job evaluation.


C. H. Lawshe, Jr.

Division of Applied Psychology,
Purdue University


The Society for the Advancement of Management, New York Chapter.
1945 Conference Proceedings: Selection of sales personnel and aptitude
testing. New York: Sutton-Malkames Co., Inc., 1946. Pp. xiii +
137. $4.00.

This is a report of a conference on sales personnel and aptitude testing held
by the New York Chapter of the Society for the Advancement of Manage-
ment. These proceedings comprise speeches concerning (1) the scientific
basis of psychological tests, (2) aptitude testing in transition: from pro-
duction to selection of salesmen, (3) the sales manpower development
program of the General Electric Co., and (4) the use of aptitude tests
as a management tool. A panel discussion follows the speeches. The
material presented is quite general and contains little factual data which
would be of assistance in guiding a testing program in industry. Some
of the panel discussion, and especially the paper by W. H. Wulfeck on the
scientific basis of psychological tests, might be of interest to those in the
field of testing and guidance.

Robert M. Thomson

University of Minnesota


Lazarsfeld, Paul F., and Field, Harry. The people look at radio. Chapel
Hill: The University of North Carolina Press, 1946. Pp. ix + [...].
$2.50.

The National Association of Broadcasters commissioned the National
Opinion Research Center to conduct a public opinion study of attitudes
toward radio. Then Columbia University's Bureau of Applied Social
Research was called in to interpret the results and prepare the report.
The principal survey was based on a national sample of 2571 men and
women with an extended sample of 672 respondents in the Mountain
and Pacific time zones used for geographical breakdowns. In addition,
two supplementary surveys, involving 498 and 1091 radio listeners, were
conducted.




The first chapter presents an overall appraisal of radio as an insti-
tution. The second chapter reports the results on attitudes toward radio
advertising, and the third chapter covers program preferences with em-
phasis on types of programs rather than specific radio shows. The fourth
chapter gives an analysis of certain industry problems: the educational 
level of critics of radio, the educational value of radio, attitudes toward 
selling or donating radio time for certain purposes, fairness in handling 
controversial issues, and the role of government in the operation of radio 
stations. 

Detailed results are quoted only when they are essential. Sources 
beyond this particular study are included when necessary, and the reader 
is given the advantage of the writer’s background of experience. This 
approach results in a report that runs smoothly and does not get bogged 
down by unnecessary detail. The appendices report the complete re- 
sults: the characteristics of the sample, the questions asked and tabu- 
lations of the responses, the characteristics of the supplementary samples, 
and certain special topics such as an analysis by levels of severity of criti- 
cism and an analysis of program preferences. Thus all the detail is in- 
cluded but it does not interrupt the main line of presentation. 

One common error in many public opinion studies is interpreting the 
results as if they were absolutes when they are merely relatives. This 
report is an outstanding exception. The overall appraisal of radio, for 
example, is put on a relative basis by comparing radio with churches, 
newspapers, schools, and local government. Again, the conclusion that 
about one-third of radio listeners have a negative attitude toward radio 
advertising is accepted only after several different approaches produced 
about the same results. Skillful cross-tabulation and group analysis are 
used to give meaning to various attitude categories. The responses of 
people who reported they “feel like criticizing when they listen to the 
radio” become meaningful when it is shown that affirmative answers are 
positively related to the amount of listening. The difference between 
“not minding” and “putting up with” radio advertising is given practical 
meaning by showing the relation to the question on whether one would 
prefer radio without advertising. The only example of reporting results 
as absolutes is a minor one: quoting the average number of hours of listen- 
ing per day when there is some question of the validity of the results when 
used for anything except division into groups for relative comparisons. 

There is every indication that the report is written from a practical 
viewpoint. In one analysis the objections of lenient critics are compared 
with those of more severe critics to find what can be done to win over 
the group that can be influenced most easily. Surely this represents a
practical approach. Throughout the report there is full recognition of
the practical problem of taking into consideration factors other than
public opinion in evaluating radio.

Probably most important of all is the impression that this was a 
completely honest project from start to finish. There is no indication of 
pointing either the questions or the interpretation in a direction favorable 
to radio. Obviously, everyone involved wanted to know the true situa-
tion regardless of whether it was favorable or unfavorable. Some people
may object to the use of quota sampling, some people may object to an 
approach that is more like the single-question method than the uni- 
dimensional attitude scale technique; but the reviewer doubts that any- 
one would have any basis for questioning the honesty of the operation. 

Alfred C. Welch 

Knox Reeves Advertising, Inc.,
Minneapolis, Minnesota


Davis, Fred B. Item-analysis data, their computation, interpretation, and
use in test construction. Harvard University, Cambridge, Mass.,
1946. Pp. vi + 42 and chart. $.75.

The extensive literature and the number of computationally simple
techniques of item analysis developed over the past fifteen years make
questionable the need for a new method. If the need be granted, how-
ever, this bulletin presents a sound and accurate summary of the under-
lying logic and a method which should prove useful. The chart simpli-
fies the conversion of data into the two basic indices, difficulty and
discrimination.

The possibility of using external as well as internal criteria is 
recognized; unfortunately, the importance of external criteria is under- 
emphasized. There is, moreover, a healthy appreciation of the often- 
overlooked fact that item analysis contributes merely one line of useful 
information about the items at hand; it cannot substitute for ingenuity 
and skill in item construction nor is it more than an aid to judgment in 
revising or selecting items. 

Difficulty is measured on a linear scale with correction for "chance 
success" and for failure to attempt items at the end of the test. The as- 
sumptions involved in both corrections are clearly stated, but even this 
statement is not likely to prevent their being overlooked in application. 
The assumptions involved in estimating difficulty from only the tails of 
the distribution (upper and lower 27%) are omitted. The percentage of
correct answers among total attempts is converted to a linear scale, using
the normal probability integral, ranging from 1 to 99. The reliability of
these indices based upon 100 cases in each tail (370 test papers altogether)
is reported as about .98. In the reviewer’s experience, values as high as
this, even for comparable samples, are unusual. 
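
The steps described in this paragraph can be sketched roughly as follows: a proportion of correct answers among attempts, corrected for chance success, is passed through the inverse normal curve to give a linear index. This is a sketch under stated assumptions and not the bulletin's own procedure. The correction shown is the conventional rights-minus-a-fraction-of-wrongs formula, and the constants `center` and `spread` that place the result on a 1 to 99 scale are hypothetical, since the review does not give them.

```python
from statistics import NormalDist


def corrected_proportion(right, wrong, n_options):
    """Proportion correct among attempts, corrected for chance success.

    Items not attempted are left out of both counts, so the base is
    total attempts rather than total examinees.
    """
    attempts = right + wrong
    corrected = right - wrong / (n_options - 1)
    return max(corrected, 0.0) / attempts


def normal_deviate_index(p, center=50.0, spread=21.0):
    """Place a proportion on a linear scale via the inverse normal curve.

    center and spread are hypothetical constants chosen so that most
    values fall between 1 and 99; results outside that range are clipped.
    """
    p = min(max(p, 0.001), 0.999)  # keep the inverse normal finite
    z = NormalDist().inv_cdf(p)
    return min(max(center + spread * z, 1.0), 99.0)


# Hypothetical item: 60 right and 40 wrong among 100 attempts, five options.
p = corrected_proportion(60, 40, 5)
print(p, normal_deviate_index(p))
```

On such a scale equal steps correspond to equal distances along the normal curve rather than to equal percentages, so a given numerical value does not mean what the same number would mean as a percentage passing.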

Difficulty and discrimination are nicely distinguished, and the effect
of difficulty on certain widely-used discrimination indices is noted.
The limitations of the critical ratio as an index of discrimination are 
particularly well stated. 

The discrimination index used is Fisher's z function based on what are 
essentially tetrachoric r's. These z-values are then transmuted to a scale 
from 0 to 100. The use of z rather than r in item analysis is a new re-
finement; how necessary a refinement is not clear. The indices are based
upon the responses of the 27% tails, following Kelley's demonstration 
that, for certain specified, and highly atypical conditions, this choice of 
groups is optimum. The practical advantage of the 27% groups over 
the simpler 25%, has never been apparent to the reviewer. 
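
Two pieces of this procedure are simple enough to sketch: picking out the upper and lower 27% of papers by total score, and converting a correlation to Fisher's z. The sketch below assumes a correlation is already in hand; it does not attempt the tetrachoric estimate itself or the bulletin's 0 to 100 rescaling, neither of which the review spells out. The function names are hypothetical.

```python
import math


def tail_groups(total_scores, fraction=0.27):
    """Return index lists for the lowest and highest scoring papers."""
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    k = max(1, round(len(total_scores) * fraction))
    return order[:k], order[-k:]


def fisher_z(r):
    """Fisher's z transformation of a correlation coefficient."""
    return math.atanh(r)  # same as 0.5 * log((1 + r) / (1 - r))


# With 370 papers, each 27% tail holds 100 cases, as in the review.
lower, upper = tail_groups(list(range(370)))
print(len(lower), len(upper))

# Equal steps in r are not equal steps in z, especially for high values.
for r in (0.2, 0.4, 0.6, 0.8):
    print(r, round(fisher_z(r), 3))
```

Changing `fraction` to 0.25 gives the simpler quarter groups the reviewer mentions, which makes it easy to compare the two choices on the same data.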

The interpretation of these difficulty and discrimination indices is 
hindered by the use of scales which assign different meanings to the nu- 
merical values used in the more familiar indices of percentage and cor- 
relation coefficient. Linearity could have been obtained without this 
possibility of confusion. 

The dependence of both indices on many factors, and their relation 
to expert review of items and careful sampling of the field are nicely 
pointed out. Greater emphasis should be given to the fact that neither 
index measures an attribute inherent in the item, but rather relationships 
among the item, the testees, and (with internal criteria) the remainder of 
the items. Davis does point out that validity may be drastically modified 
by successive selection of items on internal criteria alone.

Other writers in the field are not always appropriately recognized in 
the text. For example, although Richardson's work on item difficulty 
and test validity is cited, there is no mention of T. G. Thurstone or of
Wherry & Gaylord. 

The most serious omission is the complete lack of any reference to
cross-validation or to the need for it. The treatment implies that items
selected on the basis of one sample and applied to another similar sample
will retain all of their virtues. Item analysis capitalizes on any peculi-
arities in the sample; thus a recheck on at least one other sample is neces-
sary to give confidence that the findings characterize the population
rather than the sample alone. Moreover, a powerful use of item analysis,
which employs two or more criteria and selects items correlating
high with one and low with another as a means of constructing relatively
uncorrelated measures, is not mentioned as a possibility.
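
The recheck described above can be sketched as follows: items are selected on one half of the cases and then re-examined on the other half. This is a minimal illustration and not a procedure from the bulletin. The index used here (mean criterion score of those passing minus that of those failing), the function names, and the data are all hypothetical.

```python
from statistics import mean


def item_criterion_gap(responses, criterion):
    """Mean criterion score of those passing an item minus those failing it."""
    passed = [c for x, c in zip(responses, criterion) if x == 1]
    failed = [c for x, c in zip(responses, criterion) if x == 0]
    if not passed or not failed:
        return 0.0
    return mean(passed) - mean(failed)


def cross_validate(items, criterion, threshold):
    """Select items on the first half of the cases, recheck on the second."""
    half = len(criterion) // 2
    report = {}
    for name, responses in items.items():
        first = item_criterion_gap(responses[:half], criterion[:half])
        if first >= threshold:  # item would have been kept on sample one
            second = item_criterion_gap(responses[half:], criterion[half:])
            report[name] = (first, second)
    return report


# Hypothetical data: eight cases, two items that look equally good at first.
criterion = [10.0, 12.0, 14.0, 16.0, 10.0, 12.0, 14.0, 16.0]
items = {
    "stable": [0, 0, 1, 1, 0, 0, 1, 1],
    "fluke": [0, 0, 1, 1, 1, 1, 0, 0],
}
for name, (first, second) in cross_validate(items, criterion, 2.0).items():
    print(name, first, second)
```

An item whose index holds up in the second sample is more likely to reflect the population; one whose index collapses or reverses was probably selected on a peculiarity of the first sample.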

Despite these omissions, this bulletin has a definite value in summa- 
rizing the uses and limitations of item analysis techniques and in pre- 
senting them as tools to be used with judgment rather than as a machine 
into which one can pour data, then turn a crank and extract a finished test. 

Charles I. Mosier 

Personnel Research Section, A.G.O., 

Washington, D. C. 




Westover, Frederick L. Controlled eye movements versus practice exercises 
in reading. New York: Teachers College, Columbia University Con- 
tributions to Education, 1946, No. 917. Pp. 99. $1.95. 


The reading clinician is confronted with the problem of how much use 
is to be made of mechanical devices in his remedial work. There are now 
several of these devices available for improving reading by controlling 
eye movements. Their use, which has become widespread, is based upon 
the assumption that the pacing of eye movements will improve reading 
proficiency by correcting faulty oculomotor habits. These techniques, 
unfortunately, emphasize peripheral factors as fundamental in causing 
reading disability. But the use of pacing techniques, with or without the 
aid of mechanical devices, has quite uniformly produced improvement
in the rate of reading. Although a few studies have been concerned with
the effectiveness of pacing techniques in comparison with other methods 
of improving reading, more evidence is needed. This report compares 
three methods of improving the reading speed and comprehension of 
college freshmen: (1) college work with no special exercises although 
members of this (the control) group as well as the other groups were moti- 
vated by informing them that their reading ability was poor, that they 
needed to do better to handle college work and that improvement was pos- 
sible; (2) college work with practice in special reading exercises; (3) college 
work with practice in reading the same special exercises under conditions 
of controlled eye movements. This was achieved by means of a special 
apparatus which forced pacing of the eye movements. 

The two experimental groups gained significantly more than the con- 
trol group, but there were no significant differences in the effectiveness 
of the two instructional methods (reading exercises vs. pacing). The 
finding that mechanical pacing of eye movements yields results no differ-
ent from those achieved by the use of reading exercises is highly important.
It gives additional experimental support to the view frequently expressed 
by the reviewer that just as satisfactory gains in reading may be achieved 
without the use of pacing techniques. When pacing techniques are em- 
ployed, the teacher or the clinical worker is too prone to overemphasize 
peripheral factors in reading and to neglect the more fundamental central 
factors of perception, comprehension and assimilation. 

This investigation was adequately designed and the data skillfully
analyzed. The author is well aware of certain limitations of the investiga-
tion, i.e., the training period was rather short; the "control group" was not
a real control group since its members were motivated by special in-
structions to improve their reading. This excellent study will be well
received by those interested in the experimental study of reading, but will
be ignored by those teachers and clinicians who love their mechanical
gadgets.
Miles A. Tinker 


University of Minnesota 


Cantor, Nathaniel. Dynamics of learning. Buffalo, N. Y.: Foster and
Stewart, 1946. Pp. x + 282. $3.00. 


In this book the chairman of the Department of Anthropology and 
Sociology of the University of Buffalo describes his own experience in the 
teaching of courses in personality and culture, and in criminology, by a 
discussion method which departs radically from the usual college lecture. 
Most college teachers in the social sciences and humanities will accept 
the diagnosis of the learning situation as Professor Cantor describes it, 
but few of them will have had his courage to adapt their instructional 
methods to the implications of this diagnosis. 

The American school system is charged with being authoritarian, sup- 
porting the status-quo, and rewarding individual rather than cooperative 
effort. If these charges be true, it is little wonder that our schools have 
been little successful in remolding student attitudes in ways required by 
democracy. To become democratic, one must live democratically. Un- 
less students are permitted to express themselves in the classroom and to 
learn there to respect differences of expression, Professor Cantor asks: 
“Why should they be expected as adults to believe in and make sacrifices 
for a democratic way of life?" 

As the title of the book implies, the justification for the teaching pro-
cedure illustrated and advocated rests upon clinical psychology. Stu-
dent statements stenographically reported from class discussions, and ex-
cerpts from papers handed in, show how mechanisms such as resistance,
ambivalence, projection, and identification are found in the interplay
among students, instructor, and textbook author. The adopted method,
which recognizes that the student must be given responsibility if he is to
develop to an independent, critical, but tolerant position, has much in
common with Rogers' non-directive therapy.

All the contemporary discussion in our colleges and universities about 
revised courses of study and the re-evaluation of educational objectives 
will be empty unless teachers come to know more about how students 
learn, and begin to appraise the development of students in ways not 
shown in the usual course examination. The testimony from students 
who have studied with Professor Cantor gives strong support to these 
contentions. 

While the book is not written for the professional psychologist, it 
should give pause to the psychologist who is also a teacher. Upon read-
ing it he cannot help but ask himself, Am I really teaching in a way con-
sonant with what I know about the psychology of personality, mental 
hygiene, and attitude formation? Or am I teaching in the conventional 
ways in which I was taught with no more than tradition to justify my 
practices? If he should choose to teach in newer ways, he would find 
encouragement and instruction from this account of the success which 
Professor Cantor has had in the use of a freer method. 


Ernest R. Hilgard 
Stanford University 


Tyler, Leona E. The psychology of human differences. New York: D. Appleton-
Century Co., Inc., 1947. Pp. xiii + 420. $3.75.

This is a textbook designed for use in courses on individual differences, 
differential psychology, or human variability. It is intended to meet the 
needs both of the general student and of the undergraduate major in 
psychology. Such a book, according to the author, should give up-to- 
date information, present the facts so that they can be readily assimilated, 
and show students how to avoid wrong conclusions. The stated purpose 
is "to synthesize and reconcile opposing points of view rather than to 
perpetuate old arguments . . ., to sort out the findings which stand up 
under critical statistical analysis from those which are in error or am-
biguous, and to separate actual results from interpretations."

Dr. Tyler has performed her task admirably. The book is a model 
of clear exposition and critical appraisal of reported data. That it is 
up-to-date is indicated by the fact that 56 per cent of the 260 references 
cited are dated later than 1936. Some may even feel that the author has 
given too little attention to the historical background of recent research 
in certain of the fields covered. The index contains only two references 
to E. L. Thorndike, three to Stern, and none to Meumann. There are, 
however, seven references to Galton, six to J. Cattell, six to Spearman, 
and four to Binet. 

Apart from the introductory and concluding chapters, the topics 
covered include the nature and extent of differences, methods and logic, 
sex differences, race and nationality differences, class differences, age 
differences, mental deficiency, genius, the effects of practice upon differ- 
ences, heredity and environment, measurement of aptitudes, and the 
search for basic traits. The space devoted to each of these topics is in 
most cases between 25 and 35 pages, including chapter references and 
(for some chapters) problems listed for practice. 

Although on the whole the book is fairly inclusive and the allotment 
of space judicious, there are a few omissions which this reviewer believes
should be made good when a second edition is called for. For example,
there is little discussion of differences revealed by the vast literature on 
personality tests and character tests. There is nothing on problem 
children, and delinquency is mentioned only in reference to sex differences 
in incidence. A chapter on differences in achievement scores among
pupils in given school grades would have added considerably to the value 
of the book for prospective teachers and school administrators, as would 
also a chapter on differences in scholastic aptitudes at the high school 
and college levels. No mention is made of the enormous differences in 
scholastic achievement found by Learned and Wood. The chapter on 
age differences is confined entirely to adult subjects, without mention of 
age differences in childhood or the overlap of successive age groups in 
the earlier years. The chapter on sex differences is also confined almost 
entirely to adults. Physical differences are dealt with chiefly in their 
relation to mental traits while differences in rate of physical maturation 
are ignored. No reference is made to Gesell’s researches in develop- 
mental phenomena. 

The reviewer hopes he does not seem unduly critical in calling atten- 
tion to what he regards as gaps in a textbook that is so outstanding in 
its merits. The gaps can be filled in later. More important is the book’s 
superb quality, which is evidenced throughout in its organization, its 
exceptional readability, its critical handling of controversial issues, and 
its effectiveness in acquainting the reader with pitfalls in the interpreta-
tion of data on human differences. Students will find it interesting
and challenging.

Lewis M. Terman


Stanford, California 


New Books, Monographs, and Pamphlets 


Books, monographs, and pamphlets for listing and possible review should be sent to 
Donald G. Paterson, Editor, Department of Psychology, University 
of Minnesota, Minneapolis 14, Minnesota 


Jobs for women over 35. Julietta K. Arthur. New York: Prentice-Hall,
Inc., 1947. Pp. 250. $3.50.

Fatigue and impairment in man. S. Howard Bartley and Eloise Chute.
New York: McGraw-Hill Book Co., Inc., 1947. Pp. 429. $5.50.

Basic guidance. Ralph C. Bedell, Editor. Lincoln, Neb.: State Depart-
ment of Vocational Education, 1947. Pp. 70. $1.00.

National advertising in newspapers. Neil H. Borden, Malcolm D. Taylor, 
and Howard T. Hovde. Boston: Division of Research, Harvard Busi- 
ness School, 1946. Pp. 486. $5.00. 

Fair thought and speech. Carl F. Braun. Alhambra, Calif.: C. F. Braun 
and Co., 1947. Pp. 50. $1.25 single. $1.00 twelve or more.

Men at work. Keeve Brodman. Chicago: Cloud, Inc., 1947. Pp. 191.
$2.50.

Human relations in the classroom. Course I. H. Edmund Bullis and 
Emily E. O'Malley. Wilmington, Del.: The Delaware State Society 
for Mental Hygiene, 1947. Pp. 222. $3.00.

The psychology of behavior disorders. Norman Cameron. Boston: 
Houghton Mifflin Co., 1947. Pp. 622. $5.00.

Experimental designs in sociological research. Stuart Chapin. New 
York: Harper and Brothers, 1947. Pp. 197. $3.00.

The use of tests in college. J. G. Darley et al. Washington, D. C.:
American Council on Education, 1947. Pp. 82. $1.00.

Telepathy and medical psychology. Jan Ehrenwald. New York: W. W. 
Norton and Co., Inc., 1948. Pp. 212. $3.00. 

Race and nationality: as factors in American life. Henry P. Fairchild. 
New York: The Ronald Press Co., 1947. Pp. 240. $3.00. 


Study your way through school. C. d'A. Gerken. Chicago: Science Re-
search Associates, 1947. Pp. 48. Twenty copies, $12.00. Single
copy, $.75.

Developmental diagnosis. Second edition. Arnold Gesell and Catherine
S. Amatruda. New York: Paul B. Hoeber, Inc., 1947. Pp. 516.
$[...].50.


Child offenders. Harriet L. Goldberg. New York: Grune and Stratton, 
1947. Pp. 230. $4.00. 






Educational lessons from wartime training: general report of the commission. 
Alonzo G. Grace et al. Washington, D. C.: American Council on 
Education, 1947. Pp. 200. 

Patterns of union-management relations. Frederick H. Harbison and 
Robert Dubin. Chicago: Science Research Associates, 1947. Pp. 
240. $3.75. 

Training in clinical psychology. Molly R. Harrower, Editor. New 
York: Josiah Macy, Jr. Foundation, 1947. Pp. 88. $1.50. 

Guide to guidance. Volume IX. M. Eunice Hilton, Editor. New York:
Syracuse University Press, 1947. Pp. 58. $1.00.

The armed services and adult education. Cyril O. Houle et al.
Washington, D. C.: American Council on Education, 1947. Pp. 257. $3.00.

Fundamentals of statistics. Truman L. Kelley. Cambridge: Harvard
University Press, 1947. $10.00. 

Sexual behavior in the human male. Alfred C. Kinsey, Wardell B. Pome- 
roy, and Clyde E. Martin. Philadelphia: W. B. Saunders Company, 
1948. Pp. 804. $6.50. 

Industrial and labor relations review. Milton R. Konvitz, Editor. Ithaca: 
Cornell University Press, 1947. Annual subscription, $3.00.

The blind preschool child. Berthold Lowenfeld, Editor. New York: 
American Foundation for the Blind, Inc., 1948. Pp. 148. $2.00. 

Doctor Freud. Emil Ludwig. New York: Hellman, Williams and Co., 
1947. Pp. 317. $3.00.

Some notes on the psychology of Pierre Janet. Elton Mayo. Cambridge: 
Harvard University Press, 1948. Pp. 132. $2.50.

Physiological and psychological factors in sex behavior. Roy W. Miner,
Editor. New York: New York Academy of Sciences, 1947. Pp. 61.
$1.25.

Job evaluation. Jay L. Otis and Richard H. Leukart. New York: 
Prentice-Hall, Inc., 1947. Pp. 473. $5.00. 

Abnormal psychology. James D. Page. New York: McGraw-Hill Book
Co., Inc., 1947. Pp. 450. $4.00.

Marketing by manufacturers. Charles F. Phillips, Editor. Chicago: 
Richard D. Irwin, Inc., 1947. Pp. 039. $6.00.

Modern clinical psychology. T. W. Richards. New York: McGraw-Hill 
Book Co., Inc., 1947. Pp. 325. $3.50.

Youth, marriage and parenthood. Lemo D. Rockwood and Mary Ford.
New York: John Wiley and Sons, Inc., 1945. Pp. 298. $3.50. 

Work and effort. Thomas A. Ryan. New York: The Ronald Press Co., 
1947. Pp. 323. $4.50. 

Labor relations and human relations. Benjamin M. Selekman. New 
York: McGraw-Hill Book Co., 1947. Pp. 255. $3.00. 




Problems of early infancy. Milton J. E. Senn, Editor. New York: Josiah
Macy, Jr. Foundation, 1947. Pp. 70. $.75.

Workbook and manual in psychology. G. Milton Smith. New York:
Henry Holt and Co., 1947. Pp. 213. $1.40.

The behavior cards. Second edition. Ralph M. Stogdill. New York:
The Psychological Corporation, 1947. Pp. 18.

Psychopathology and education of the brain-injured child. Alfred A.
Strauss and Laura E. Lehtinen. New York: Grune and Stratton,
1947. Pp. 270. $5.00.

Thematic apperception test. Silvan S. Tomkins. New York: Grune and
Stratton, 1947. Pp. 320. $5.00.

Effective personality building. Gwenyth R. Vaughn and Charles Ro 
New York: McGraw-Hill Book Co., Inc., 1947. Pp. 290. 

Practical handbook for group guidance. Barbara Wright. Chicago:
Science Research Associates, 1947. Pp. 225. $3.00. 

The place of psychology in an ideal university. Report of the Harvard
Commission. Cambridge: Harvard University Press, 1947. Pp. 4
$1.50. Discount for quantities of ten.


Journal of Applied Psychology 


Vol. 32, No. 3 June, 1948 


A New Readability Yardstick *


Rudolf Flesch 
Dobbs Ferry, N. Y. 


In 1943 the writer developed a statistical formula for the objective 
measurement of readability (comprehension difficulty) (5, 6). The for- 
mula was based on a count of three language elements: average sentence 
length in words, number of affixes, and number of references to people. 
Since its publication, the formula has been put to use in a wide variety of 
fields. For example, it has been applied to newspaper reports (9, 20), 
advertising copy (1), government publications (19), bulletins and leaflets 
for farmers (3), materials for adult education (4), and children's books 
(12). Its validity has been reaffirmed by five independent studies: the 
formula ratings of psychology textbooks substantially agreed with ratings 
by students and teachers (17); the formula scores rated specially edited 
radio news, newsmagazine, and Sunday news-summary copy “more read- 
able” than comparable newspaper reports (18); advertisements, rated 
"more readable" by the formula, showed higher readership figures (7); and 
articles that were simplified with the aid of the formula brought increased 
readership in two successive split-run tests (13, 14). Since 1943, a num- 
ber of academic institutions have incorporated the formula in the curricu- 
lum of courses in composition, creative writing, journalism, and adver- 
tising; it has also been used as the basis of several graduate research 
projects. 

Because of this wide application, it seemed worthwhile to re-examine 
the formula and to analyze its shortcomings. One of these is to be traced 
to the basic structure of the formula; others are the results of difficulties 
in its application.

The structural shortcoming of the formula is the fact that it does not
always show the high readability of direct, conversational writing. For
example, in the study of psychology texts mentioned above (17), the score
of Koffka's Principles of gestalt psychology (“the students’ choice for
unreadability”) was 5.4 (“difficult”); yet William James’ Principles of
psychology, a classic example of readability, rated 6.0 (bordering on “very
difficult”). Similarly, the formula consistently rates the popular Reader’s
Digest more readable than the sophisticated New Yorker magazine,
although many educated readers consider the Reader’s Digest dull and the
sprightly New Yorker ten times as readable.

* Samples from the main body of this paper, when tested for readability by the
method here proposed, had an average “reading ease” score of 30 and a “human interest”
score of 0. Presumably, the paper is easier to read than most other articles appearing
in scientific journals. The section, “The Formulas Restated,” which contains directions
for users of the formulas, has a “reading ease” score of 79 and a “human interest”
score of 42, which puts that portion of the article in the class of a good cookbook.

Aside from that, the practical application of the formula led to several 
minor misinterpretations. Sentence length, for instance, is the element 
with the heaviest weight; it is also the easiest to measure. As a result, 
this feature of the formula is often overemphasized, sometimes to the 
exclusion of the others—as in the directives that have been issued to staff 
writers of the Associated Press and the New York Times, recommending 
the use of shorter sentences in “leads.” On the other hand, the second 
element—number of affixes—seems often difficult to apply; users of the 
formula found this count particularly tedious and admitted to uncer- 
tainty in spotting affixes. The third element—references to people— 
raised no such questions; but it was sometimes felt to be arbitrary and the 
underlying principle was often misunderstood. 

In addition, many people found it hard to get used to the scoring
system, which generally ranges from 0 (“very easy”) to 7 (“very
difficult”). Also, the average time needed to test a 100-word sample is six
minutes (4). This makes the application of the formula considerably
faster than that of earlier formulas, which required reference to word
lists (e.g. Gray-Leary (8) or Lorge (10)), but it is still too long for
practical use.

The revision of the formula presented in this paper is an attempt to
overcome these shortcomings and make the formula a more useful in- 
strument. 


Procedure 


The criterion used in the original formula was McCall-Crabbs’ Stand- 
ard test lessons in reading (11). The formula was so constructed that it 
predicted the average grade level of a child who could answer correctly 
three-quarters of the test questions asked about a given passage. Its 
multiple correlation coefficient was R = .74. It was partly based on 
statistical findings established in an earlier study by Lorge (10). 

For many obvious reasons, the grade level of children answering test
questions is not the best criterion for general readability. Data about
the ease and interest with which adults will read selected passages would
be far better. But such data were not available at the time the first
formula was developed, and they are still unavailable today. So
McCall-Crabbs’ Standard test lessons are still the best and most extensive
criterion that can be found; therefore they were used again for the revision.
In reanalyzing the test passages, the following elements were used: 


(1) Average Sentence Length in Words. The same element was used 
in the previous formula, but the correlation coefficient used was taken 
from Lorge's earlier findings. In the present study this coefficient was 
recomputed. 

(2) Average word length in syllables, expressed as the number of syl- 
lables per 100 words. The hypothesis was that this measure would 
furnish results similar to the affix count in the earlier formula. Syllables 
are obviously easier to count than affixes since this work can be reduced 
to a mechanical routine.

(3) Average Percentage of “Personal Words." The same element was 
used in the earlier formula. However, the opportunity was used to test
a clarified definition, which made no significant difference in correlation.
The new definition was stated as follows: All nouns with natural gender; 
all pronouns except neuter pronouns; and the words people (used with the 
plural verb) and folks. 

(4) Average Percentage of “Personal Sentences." This new element 
was designed to correct the structural shortcoming of the earlier formula, 
mentioned above. By hypothesis, it tests the conversational quality and 
the story interest of the passage analyzed. It was defined as the
percentage of the following sentences: Spoken sentences, marked by
quotation marks or otherwise; questions, commands, requests, and other
sentences directly addressed to the reader; exclamations; and
grammatically incomplete sentences whose meaning has to be inferred from
the context.


To make the prediction more accurate, 13 of the 376 McCall-Crabbs’
passages that contained poetry or problems in arithmetic were omitted
in the count of the first two elements, which are designed to test solely
prose comprehension. However, these 13 passages were retained in the
count of the last two elements, which are designed to test human interest.

Following the procedure in the earlier study, intercorrelations were 
then computed. However, multiple correlation of the four elements 
with the criterion showed no significant gain in prediction value over 
the earlier formula in spite of the significant prediction value of the addi- 
tional fourth element by itself (r = — .27). Therefore, two multiple- 
correlation regression formulas were computed: one using the first two 
elements and one using the last two. This procedure had the advantage
of giving independent predictions of the reading ease and the human
interest of a given passage.




Finally, the resulting twin formulas were expressed in such a way that 
maximum readability (in both formulas) had a value of 100, and minimum 
readability a value of 0. This was done to make the scores more readily 
understandable for the practical user. 


Findings

The intercorrelations, means, standard deviations, and regression
weights found are shown in Tables 1, 2, and 3. The following symbols
were used: wl for word length (syllables per 100 words), sl for sentence
length in words, pw for percentage of “personal words,” ps for percentage
of “personal sentences,” C50 for the average grade of children who could
answer one-half of the test questions correctly, and C75 for the average
grade of children who could answer three-quarters of the test questions
correctly.


Table 1
Correlations, Means, Standard Deviations, and Regression Weights
of Word and Sentence Length

         sl        C50        Mean         s          β
wl     .4644     .6648     134.2208    13.6845     .5422
sl       --      .5157*     16.5213     5.5509     .2639

* After the preparation of this paper two articles appeared that pointed out a
computational error affecting the writer's original formula (Dale, E. and Chall, Jeanne S.
A formula for predicting readability. Educ. Res. Bull., Ohio St. Univ., 1948, 27, 11-20,
28; Lorge, I. The Lorge and Flesch readability formulae: a correction. Sch. & Soc.,
1948, 67, 141-142). The error concerned the correlation coefficient between sentence
length and the criterion, which had originally been reported by Lorge as .6174; the
writer, acknowledging his debt to Lorge, used that figure without recomputation. The
corrected correlation coefficient is now reported as .4681 by Dale and Chall, and as
.467 by Lorge; this corresponds closely to the figure of .5157 reported in Table 1,
considering the fact that the writer now used a slightly better criterion of 363 passages
for sentence length. In other words, the formula presented in this paper incidentally
and independently also corrects the error found by Dale and Chall and by Lorge.


Table 2
Correlations, Means, Standard Deviations, and Regression Weights
of Personal Words and Sentences

         ps        C50        Mean         s          β
pw     .2268    -.3881       7.3457     5.5175    -.3446
ps       --     -.2699      29.5745    35.5822    -.1917


Table 3
Means and Standard Deviations of Two Criteria

          Mean        s
C50     5.4973     1.3877
C75     7.3484     2.1945

The two regression formulas based on these correlations are:

Formula A (for predicting “reading ease”): RE = 206.835 - .846 wl
- 1.015 sl.

The scores computed by this formula have a range from 0 to 100 for
almost all samples taken from ordinary prose. A score of 100 corresponds
to the prediction that a child who has completed fourth grade will be
able to answer correctly three-quarters of the test questions to be asked
about the passage that is being rated; in other words, a score of 100
indicates reading matter that is understandable for persons who have
completed fourth grade and are, in the language of the U. S. Census, barely
“functionally literate.” The range of 100 points was arrived at by
multiplying the grade level prediction by 10, so that a point on the formula
scale corresponds to one-tenth of a grade. However, this relationship
holds true only up to about seventh grade; beyond that, the formula
under-rates grade level to an increasing degree. Finally, the formula—
which predicted grade level and, therefore, difficulty—was “turned
around” by reversing the signs to predict “reading ease.” (Before this
transformation, the formula read: C75 = .0846 wl + .1015 sl - 5.6835.)
The multiple correlation coefficient of this formula is R = .7047.
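
The link between the two printed versions of Formula A can be verified with a few lines of arithmetic. The sketch below uses only the two printed formulas; the function names are illustrative, and reading the result as "a score of 100 corresponds to a predicted grade of 5.0" is an inference from the arithmetic, not a statement in the paper.

```python
def predicted_grade(wl, sl):
    """Grade-level form of Formula A, as printed before the transformation."""
    return 0.0846 * wl + 0.1015 * sl - 5.6835


def reading_ease(wl, sl):
    """Formula A in its final "reading ease" form."""
    return 206.835 - 0.846 * wl - 1.015 * sl


# Means of wl and sl from Table 1
wl_mean, sl_mean = 134.2208, 16.5213

grade = predicted_grade(wl_mean, sl_mean)
score = reading_ease(wl_mean, sl_mean)

# Multiplying the grade by 10 and reversing the sign gives: RE = 150 - 10 * grade
print(grade, score, 150 - 10 * grade)
```

At the Table 1 means the predicted grade comes out very close to the criterion mean of 7.3484 in Table 3, which suggests that the printed constant was fitted to that criterion.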

Formula B (for predicting “human interest”): HI = 3.635 pw
+ .314 ps.

Scores computed by this formula, too, have a range from 0 to 100.
A score of 100 has the same meaning as in Formula A. It indicates
reading matter with enough human interest to suit the reading skills and
habits of a barely “functionally literate” person. A score of 0, however,
means here simply that the passage contains neither “personal words”
nor “personal sentences”; in contrast to Formula A, the two elements
counted here may be totally absent. Since the zero point could be fixed
in this way, the scoring was arrived at by dividing the range between 0
(absence of both elements) and 100 (prediction of completed fourth grade)
by 100. The formula therefore contains no statistical constant. The
signs were reversed in the same fashion as in Formula A. (Before
transformation, this formula read: C75 = -.1333 pw - .0115 ps + 8.6673.)
The multiple correlation coefficient of this formula is R = .4306.
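
The scaling of Formula B can be retraced in the same way. The sketch below assumes that a score of 100 is matched to a predicted grade of 5.0, the value that Formula A gives for a score of 100; that figure is an inference, and the names are illustrative.

```python
ZERO_POINT_GRADE = 8.6673  # printed constant: predicted grade when pw = ps = 0
TOP_GRADE = 5.0            # assumption: predicted grade matched to a score of 100


def human_interest_from_grade(pw, ps):
    """Rescale the grade-level form so that 0..100 spans the two grades above."""
    grade = -0.1333 * pw - 0.0115 * ps + 8.6673
    return 100.0 * (ZERO_POINT_GRADE - grade) / (ZERO_POINT_GRADE - TOP_GRADE)


def human_interest(pw, ps):
    """Formula B as printed."""
    return 3.635 * pw + 0.314 * ps


scale = 100.0 / (ZERO_POINT_GRADE - TOP_GRADE)
print(round(0.1333 * scale, 3), round(0.0115 * scale, 3))
print(human_interest_from_grade(10, 39), human_interest(10, 39))
```

With this assumption the rescaled weights round to the printed 3.635 and .314, and the two versions of the formula differ only by rounding.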

Since the correlations of three of the four elements with the criterion
C50 were higher than those with the criterion C75, the multiple correlation
with the criterion C50 was computed first. As a second step, the values
so found were used to predict criterion C75, since it seemed obviously
more desirable to predict 75% comprehension than 50% comprehension.

The correlation between the word length factor (syllable count) and
the corresponding affix count in the earlier formula was found to be
r = .87. For practical purposes the two measures may therefore be
considered equivalent.

The number of affixes per 100 words (a) can be predicted from the
syllable count (wl) by the formula: a = .6832 wl - 66.6017.
Conversely, the number of syllables per 100 words (wl) can be predicted from
the number of affixes (a) by the formula: wl = 1.49 a + 94.56.


Comment 


It is hoped that the two new formulas will prove more useful than the
earlier formula. Formula A alone, with a correlation coefficient of .70,
has almost as high a prediction value as the combined earlier formula,
whose correlation coefficient was .74. Formula B has a much lower
correlation coefficient of .43 and, accordingly, does not seem to contribute
much to the measurement of readability. It should be remembered,
however, that because of the criterion used, Formula B predicts only
the effect of the two “human interest” elements on comprehension; in
other words, the correlation coefficient shows only to what extent human
interest in a given text will make the reader understand it better. The
real value of this formula, however, lies in the fact that human interest
will also increase the reader’s attention and his motivation for
continued reading.

In addition, the two new formulas will be more useful for the teaching
of writing, since the added factor and the division into two parts will
show specific faults in writing more clearly.

The significance of Formula A will be more easily understood when it
is realized that the measurement of word length is indirectly a
measurement of word complexity (as mentioned above, the correlation is r = .87)
and that word complexity in turn is indirectly a measurement of
abstraction: the correlation between the number of affixes and that of
abstract words was found to be .78 (5). Similarly, the measurement of
sentence length is indirectly a measurement of sentence complexity.
In two independent studies the correlation between these two factors
was found to be .775 (8) and .72 (15). Sentence complexity, in turn,
may again be considered as a measure of abstraction. Formula A,
therefore, is essentially a test of the level of abstraction.




It seems hardly necessary to prove the importance of human interest 
in reading, as tested by Formula B. That people are most interested 
in other people is an old truism. And the readability value of written 
dialogue, as tested by the added element, is well described in the following, 
oddly parallel quotations from a printer and a novelist: “Have you ever 
watched people at a library selecting books for home reading? Other 
things being equal, if they see enough pages that . . . promise interest- 
ing dialogue, they are much more apt to put the book under their arm 
and walk away with it, than if they see too many solid pages . . . which 
always suggest hard work” (16). “‘What is the use of a book without
pictures or conversations?’ thought Alice just before the White Rabbit
ran by, in condemnation of the book her sister was reading, and this 
childish comment is supported by novel-readers of all degrees of in- 
telligence. Long close paragraphs of print are in themselves apt to 
dismay the less serious readers and their instinct here is a sound one, for 
an excess of summary and an insufficiency of scene in a novel make the 
story seem remote, without bite, second-hand. . . . A great part of the 
vigor, the vivacity and the readability of Dickens derives from his in- 
numerable interweavings of scene and summary; his general method is 
to keep summary to the barest essential minimum, a mere sentence or 
two here and there between the incredibly fertile burgeoning of his 
scenes” (2). 

In preliminary tests of the formulas, the following results were found: 

When the newly isolated fourth element (“personal sentences”) was
applied to the psychology texts by Koffka and James mentioned above
(17), it was found that the percentage of “personal sentences” in Koffka
was negligible (4%), whereas in James's first volume it was 16% and in
his second volume 10%. A striking example of this difference in style
is the following of James's “personal sentences”: “Ask half the common
drunkards you know why it is that they fall so often prey to temptation,
and they will say that most of the time they cannot tell.” This sentence
shows well the aspect of readability that eluded the earlier formula.

When the old and the new formulas were applied to two random issues
of the New Yorker (October 26, 1946) and the Reader’s Digest (November,
1946), the results were as shown in Table 4.


Table 4
Comparative Analysis of The New Yorker (October 26, 1946) and the
Reader’s Digest (November, 1946)

                                          New Yorker    Reader's Digest
Old Formula:
  Average sentence length in words            20              16
  Affixes per 100 words                       36          [illegible]
  Personal words per 100 words                10          [illegible]
  Readability score                          3.59            3.05
New Formula A:
  Average sentence length in words            20              16
  Syllables per 100 words                    148          [illegible]
  “Reading ease” score                        61          [illegible]
New Formula B:
  Personal words per 100 words                10          [illegible]
  Personal sentences per 100 sentences        39              15
  “Human interest” score                      49              34


As can be seen, the old formula rated the Reader's Digest significantly
more readable than the New Yorker; the new formula A also shows that
the Reader's Digest is significantly easier to read. But the new formula
B clearly shows a large difference in human interest in favor of the
New Yorker.


The Formulas Restated


For practical application, the formulas may be restated this way:
To measure the readability (“reading ease” and “human interest”) of
a piece of writing, go through the following steps:


Step 1. Unless you want to test a whole piece of writing, take samples.
Take enough samples to make a fair test (say, three to five of an article
and 25 to 30 of a book). Don't try to pick “good” or “typical” samples.
Go by a strictly numerical scheme. For instance, take every third
paragraph or every other page. Each sample should start at the
beginning of a paragraph.

Step 2. Count the words in your piece of writing or, if you are using
samples, take each sample and count each word in it up to 100. Count
contractions and hyphenated words as one word. Count as words
numbers or letters separated by space.

Step 3. Count the syllables in your 100-word samples or, if you are
testing a whole piece of writing, compute the number of syllables per
100 words. If in doubt about syllabication rules, use any good dictionary.
Count the number of syllables in symbols and figures according to the way
they are normally read aloud, e.g. two for $ (“dollars”) and four for 1918
(“nineteen-eighteen”). If a passage contains several or lengthy figures,
your estimate will be more accurate if you don't include these figures in
your syllable count. In a 100-word sample, be sure to add instead a
corresponding number of words in your syllable count. To save time,
count all syllables except the first in all words of more than one syllable
and add the total to the number of words tested. It is also helpful to
“read silently aloud” while counting.
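
The time-saving shortcut in Step 3 can be sketched in code. The syllable estimator below is a crude vowel-group rule introduced here as an assumption; the paper calls for dictionary syllabication, so the estimator is only a stand-in, and both function names are illustrative.

```python
import re


def estimate_syllables(word):
    """Rough syllable estimate from vowel groups (not a dictionary rule)."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Treat a final "e" as silent ("score"), except in "-le" endings ("table").
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def syllables_per_100_words(words):
    """Shortcut from Step 3: count syllables beyond the first, then add the word count."""
    if not words:
        raise ValueError("need at least one word")
    extra = sum(estimate_syllables(w) - 1 for w in words)
    total_syllables = len(words) + extra
    return 100.0 * total_syllables / len(words)


print(syllables_per_100_words(["the", "formula", "counts", "syllables"]))
```

For a 100-word sample the result is simply the syllable total; for other lengths the function scales it to a per-100-words figure, as Step 3 asks for whole pieces of writing.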

Step 4. Figure the average sentence length in words for your piece 
of writing or, if you are using samples, for all your samples combined. 
In a 100-word sample, find the sentence that ends nearest to the 100- 
word mark—that might be at the 94th word or the 109th word. Count 
the sentences up to that point and divide the number of words in those 
sentences by the number of sentences. In counting sentences, follow 
the units of thought rather than the punctuation: usually sentences are 
marked off by periods; but sometimes they are marked off by colons or 
semicolons—like these. But don't break up sentences that are joined 
by conjunctions like and or but. 
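
The rule in Step 4 for closing a 100-word sample at the nearest sentence end can be written out as follows. This is a minimal sketch for a single sample: it assumes the sentences have already been marked off by hand as units of thought, and the function name is illustrative.

```python
def average_sentence_length(sentence_word_counts, target=100):
    """Average words per sentence, cutting at the sentence end nearest the target.

    sentence_word_counts: words in each sentence, in order from the sample start.
    """
    if not sentence_word_counts:
        raise ValueError("need at least one sentence")
    best_sentences, best_words = 0, 0
    running = 0
    for n, count in enumerate(sentence_word_counts, start=1):
        running += count
        if best_sentences == 0 or abs(running - target) < abs(best_words - target):
            best_sentences, best_words = n, running
        if running >= target:
            break  # later sentence ends can only be farther from the mark
    return best_words / best_sentences


# Sentence ends fall at words 20, 50, 75, 94, 109: the 94th word is nearest to 100.
print(average_sentence_length([20, 30, 25, 19, 15, 40]))
```

When several samples are used, Step 4 asks for one average over all samples combined, so the word and sentence totals of the individual samples would be summed before dividing.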

Step 5. Figure the number of “personal words” per 100 words in
your piece of writing or, if you are using samples, in all your samples
combined. “Personal words” are: (a) All first-, second-, and third-person
pronouns except the neuter pronouns it, its, itself, and they, them, their,
theirs, themselves if referring to things rather than people. (b) All words
that have masculine or feminine natural gender, e.g. Jones, Mary, father,
sister, iceman, actress. Do not count common-gender words like teacher,
doctor, employee, assistant, spouse. Count singular and plural forms.
(c) The group words people (with the plural verb) and folks.
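
A rough mechanical version of the Step 5 count might look like the sketch below. It is only an approximation: a program cannot tell whether "they" refers to people or to things, or whether "people" takes a plural verb, and names and gendered nouns have to be supplied by the user. The word lists and the function name are assumptions made for illustration.

```python
import re

PRONOUNS = {
    "i", "me", "my", "mine", "myself",
    "we", "us", "our", "ours", "ourselves",
    "you", "your", "yours", "yourself", "yourselves",
    "he", "him", "his", "himself",
    "she", "her", "hers", "herself",
    # Counted here unconditionally; Step 5 excludes them when they refer to things.
    "they", "them", "their", "theirs", "themselves",
}
GROUP_WORDS = {"people", "folks"}


def personal_words_per_100(text, gendered_words=()):
    """Percentage of "personal words"; gendered_words holds names and gendered nouns."""
    # Contractions and hyphenated words are kept as single words, as in Step 2.
    words = re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())
    if not words:
        raise ValueError("no words found")
    personal = PRONOUNS | GROUP_WORDS | {w.lower() for w in gendered_words}
    hits = sum(1 for w in words if w in personal)
    return 100.0 * hits / len(words)


print(personal_words_per_100("He told Mary that people like it.", gendered_words={"Mary"}))
```

In the example, "He", "Mary", and "people" are counted and the neuter "it" is not, giving three personal words out of seven.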

Step 6. Figure the number of “personal sentences” per 100 sentences
in your piece of writing or, if you use samples, in all your samples
combined. “Personal sentences” are: (a) Spoken sentences, marked by
quotation marks or otherwise, often including so-called speech tags like “he
said” (e.g. “I doubt it.”—We told him: “You can take it or leave it.”—
“That’s all very well,” he replied, showing clearly that he didn't believe
a word of what we said). (b) Questions, commands, requests, and other
sentences directly addressed to the reader. (c) Exclamations. (d)
Grammatically incomplete sentences whose full meaning has to be
inferred from the context (e.g. Doesn't know a word of English.—
Handsome, though.—Well, he wasn't.—The minute you walked out). If a
sentence fits two or more of these definitions, count it only once. Divide
the number of these “personal sentences” by the total number of
sentences you found in Step 4.
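
Part of the Step 6 count can be approximated from punctuation alone. The sketch below flags sentences that contain quotation marks or end in a question mark or exclamation point; commands, requests, and incomplete sentences need a human reader and are not detected. The function name and the punctuation test are assumptions for illustration.

```python
QUOTE_MARKS = ('"', "\u201c", "\u201d")


def personal_sentences_per_100(sentences):
    """Percentage of sentences flagged as "personal" by punctuation only."""
    if not sentences:
        raise ValueError("need at least one sentence")

    def is_personal(sentence):
        text = sentence.strip()
        spoken = any(mark in text for mark in QUOTE_MARKS)
        # Look past a closing quotation mark when checking the final punctuation.
        ending = text.rstrip('"\u201d')
        question_or_exclamation = ending.endswith(("?", "!"))
        # A sentence that fits several definitions is still counted once.
        return spoken or question_or_exclamation

    hits = sum(1 for s in sentences if is_personal(s))
    return 100.0 * hits / len(sentences)


sample = ['"I doubt it."', "He left.", "Why not?", "Stop!", "The test ended."]
print(personal_sentences_per_100(sample))
```

Because the check undercounts the categories it cannot see, the result is best treated as a lower bound to be corrected by hand.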

Step 7. Find your “reading ease” score by inserting the number of
syllables per 100 words (word length, wl) and the average sentence length
(sl) in the following formula:


R.E. (“reading ease”) = 206.835 - .846 wl - 1.015 sl.


The “reading ease” score will put your piece of writing on a scale be- 
tween 0 (practically unreadable) and 100 (easy for any literate person). 
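
Step 7 translates directly into a small function. The sketch below uses the printed coefficients; the helper that starts from raw counts and both function names are illustrative additions.

```python
def reading_ease(wl, sl):
    """Formula A: wl = syllables per 100 words, sl = average sentence length in words."""
    return 206.835 - 0.846 * wl - 1.015 * sl


def reading_ease_from_counts(words, syllables, sentences):
    """Same score, starting from the raw counts gathered in Steps 2 to 4."""
    wl = 100.0 * syllables / words
    sl = words / sentences
    return reading_ease(wl, sl)


# New Yorker column of Table 7: 145 syllables per 100 words, 18 words per sentence.
print(round(reading_ease(145, 18)))
```

With the rounded New Yorker figures from Table 7 the function returns the printed score of 66. Scores recomputed from rounded table entries can differ from the printed ones by a point, since the printed scores were presumably based on unrounded counts.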




Step 8. Find your “human interest” score by inserting the percentage
of “personal words” (pw) and the percentage of “personal sentences” (ps)
in the following formula:


H.I. (“human interest") = 3.635 pw + .314 ps. 


The “human interest” score will put your piece of writing on a scale 
between 0 (no human interest) and 100 (full of human interest). 
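
Step 8 can be sketched the same way. The coefficients are the printed ones; the helper that starts from raw counts and the function names are illustrative.

```python
def human_interest(pw, ps):
    """Formula B: pw = personal words per 100 words, ps = personal sentences per 100 sentences."""
    return 3.635 * pw + 0.314 * ps


def human_interest_from_counts(personal_words, words, personal_sentences, sentences):
    """Same score, starting from the raw counts gathered in Steps 5 and 6."""
    pw = 100.0 * personal_words / words
    ps = 100.0 * personal_sentences / sentences
    return human_interest(pw, ps)


# Table 7: Life (pw = 2, ps = 0) and The New Yorker (pw = 11, ps = 41).
print(round(human_interest(2, 0)), round(human_interest(11, 41)))
```

Both results match the "human interest" scores of 7 and 53 printed in Table 7.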

In applying the formulas, remember that Formula A measures length 
(the longer the words and sentences, the harder to read) and Formula 
B measures percentages (the more personal words and sentences, the 
more human interest). 

Roughly, “reading ease" scores will tend to follow the pattern shown 
in Table 5. 


“Human interest” scores will follow the general pattern shown in 
Table 6. 


Table 5
Pattern of “Reading Ease” Scores

“Reading Ease”   Description         Typical         Syllables         Average Sentence
Score            of Style            Magazine        per 100 Words     Length in Words
0 to 30          Very difficult      Scientific      192 or more       29 or more
30 to 50         Difficult           Academic        167               25
50 to 60         Fairly difficult    Quality         155               21
60 to 70         Standard            Digests         147               17
70 to 80         Fairly easy         Slick-fiction   139               14
80 to 90         Easy                Pulp-fiction    131               11
90 to 100        Very easy           Comics          123 or less       8 or less


Table 6
Pattern of “Human Interest” Scores

“Human Interest”   Description          Typical       Percentage of     Percentage of
Score              of Style             Magazine      Personal Words    Personal Sentences
0 to 10            Dull                 Scientific    2 or less         0
10 to 20           Mildly interesting   Trade         4                 5
20 to 40           Interesting          Digests       7                 15
40 to 60           Highly interesting   New Yorker    11                32
60 to 100          Dramatic             Fiction       17 or more        58 or more
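
The score bands of Tables 5 and 6 lend themselves to a simple lookup. In the sketch below a score that falls exactly on a boundary (for example 30) is placed in the higher band; the tables leave that case open, so this is an assumption, as are the function names.

```python
from bisect import bisect_right

READING_EASE_BOUNDS = [30, 50, 60, 70, 80, 90]
READING_EASE_LABELS = [
    "Very difficult", "Difficult", "Fairly difficult", "Standard",
    "Fairly easy", "Easy", "Very easy",
]

HUMAN_INTEREST_BOUNDS = [10, 20, 40, 60]
HUMAN_INTEREST_LABELS = [
    "Dull", "Mildly interesting", "Interesting", "Highly interesting", "Dramatic",
]


def describe_reading_ease(score):
    """Description of style from Table 5 for a "reading ease" score."""
    return READING_EASE_LABELS[bisect_right(READING_EASE_BOUNDS, score)]


def describe_human_interest(score):
    """Description of style from Table 6 for a "human interest" score."""
    return HUMAN_INTEREST_LABELS[bisect_right(HUMAN_INTEREST_BOUNDS, score)]


print(describe_reading_ease(66), "/", describe_human_interest(53))
```

A "reading ease" score of 66 and a "human interest" score of 53 are thus described as "Standard" and "Highly interesting."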




Sample Application 


As an example of the application of the new formulas, two recent
descriptions of the “nerve-block” method of anesthesia will be used.
By an odd coincidence, these two variations upon a theme appeared within
the same week in Life (October 27, 1947) and The New Yorker (October
25, 1947). The Life story served as text accompanying a series of
pictures; it is straight reporting, not particularly simple, and lacks human
interest (which was supplied by the pictures). The New Yorker passage
is part of a personality profile, vivid, dramatic, using all the tricks of the
trade to get the reader interested and keep him in suspense.


From Life: 


Except in the field of surgery, control of pain is still very much in the
primitive stages. Countless thousands of patients suffer the tortures of cancer,
angina pectoris and other distressing diseases while their physicians are helpless
to relieve them. A big step toward help for these sufferers is now being made
with a treatment known as nerve-blocking. This treatment, which consists
of putting a “block” between the source of pain and the brain, is not a new
therapy. But its potentialities are just now being realized. Using better
drugs and a wider knowledge of the mechanics of pain gained during and since
the war, Doctors E. A. Rovenstine and E. M. Papper of the New York
University College of Medicine have been able to help two-thirds of the patients
accepted for treatment in their “pain clinic” at Bellevue Hospital.

The nerve-block treatment is comparatively simple and does not have
serious aftereffects. It merely involves the injection of an anesthetic [illegible]
along the path of the nerve carrying pain impulses from the diseased or injured
tissue to the brain. Although its action is similar to that of spinal anesthesia
used in surgery, nerve block generally lasts much longer and is only occasionally
used for operations. The N. Y. U. doctors have found it effective in a wide
range of diseases, including angina pectoris, sciatica, shingles, neuralgia and
some forms of cancer. Relief is not always permanent, but usually the
injection can be repeated. Some angina pectoris patients have had relief for periods
ranging from six months to two years. While recognizing that nerve block
is no panacea, the doctors feel that results obtained in cases like that of Mike
Ostroich (next page) will mean a much wider application in the near future.


From The New Yorker: 


. . . Recently, [Rovenstine] devoted a few minutes to relieving a free
patient in Bellevue of a pain in an arm that had been cut off several years
[...] ached and [...] he could feel his nails digging into his [...], Dr. E. M.
Papper, reminded Rovenstine that a [...] would have been to dig up the
[...] and straighten out the hand. Rovenstine [...]

The man with the pain in the nonexistent hand was an indigent, and
Rovenstine was [illegible] before a large gallery of student anesthetists and
visitors when he exorcised the ghosts that were paining him. Some of the
spectators, though they felt awed, also felt inclined to giggle. Even trained
anesthetists sometimes get into this state during nerve-block demonstrations
because of the tenseness such feats of magic induce in them. The patient,
thin, stark-naked, and an obvious product of poverty and cheap gin mills,
was nervous and rather apologetic when he was brought into the operating
theatre. He lay face down on the operating table. Rovenstine has an easy
manner with patients, and as his thick, stubby hands roamed over the man's
back, he gently asked, “How you doing?” “My hand, it is all closed together,
Doc,” the man answered, startled and evidently a little proud of the attention
he was getting. “You’ll be O.K. soon,” Rovenstine said, and turned to the
audience. “One of my greatest contributions to medical science has been the
use of the eyebrow pencil,” he said. He took one from the pocket of his white
smock and made a series of marks on the patient's back, near the shoulder of
the amputated arm, so that the [illegible] could see exactly where he was
going to work. With a syringe and needle, he raised four small weals on the
man’s back and then shoved long needles into the weals. The man shuddered
but said he felt no pain. Rovenstine then attached a syringe to the first
needle, injected the procaine solution, unfastened the syringe, attached it to
the next needle, injected more of the solution, and so on. The patient's face
began to relax a little. “Lord, Doc,” he said. “My hand is loosening up a
bit already.” “You'll be all right by tonight, I think,” Rovenstine said.
He was.


A comparative analysis of these two passages is shown in Table 7. 
The two passages furnish a good illustration of the stylistic features 
measured and emphasized by the two new formulas. 


Table 7
Comparative Analysis of Treatment of Same Theme in Life and The New Yorker

                                              Life         New Yorker
                                           (290 words)    (495 words)

Old Formula:
   Average sentence length in words            22             18
   Affixes per 100 words                       48             35
   Personal words per 100 words                 2             11
   Readability score                          5.16           3.20

New Formula A:
   Average sentence length in words            22             18
   Syllables per 100 words                    165            145
   “Reading ease” score                        46             66

New Formula B:
   Personal words per 100 words                 2             11
   Personal sentences per 100 sentences         0             41
   “Human interest” score                       7             53
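
The two new scores in Table 7 can be checked approximately from the counts in the table. The sketch below assumes the constants usually quoted for the two new formulas (206.835, 0.846, 1.015 for "reading ease"; 3.635, 0.314 for "human interest"); the function and variable names are illustrative only. Because the table lists rounded counts, a result may differ from the printed score by about a point.

```python
# Sketch: recompute the "reading ease" and "human interest" scores of Table 7.
# The constants are assumed to be those of the published formulas; the counts
# are the rounded values printed in the table, so small discrepancies can occur.

def reading_ease(avg_sentence_length, syllables_per_100_words):
    """New Formula A: higher scores mean easier reading."""
    return 206.835 - 0.846 * syllables_per_100_words - 1.015 * avg_sentence_length


def human_interest(personal_words_per_100, personal_sentences_per_100):
    """New Formula B: higher scores mean more human interest."""
    return 3.635 * personal_words_per_100 + 0.314 * personal_sentences_per_100


# (sentence length, syllables/100 words, personal words/100, personal sentences/100)
passages = {
    "Life": (22, 165, 2, 0),
    "New Yorker": (18, 145, 11, 41),
}

for name, (sl, wl, pw, ps) in passages.items():
    print(name, round(reading_ease(sl, wl), 1), round(human_interest(pw, ps), 1))
```

With the rounded inputs, the New Yorker passage comes out near 66 and 53, and the Life passage near 45 and 7, in line with the table.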

Received November 8, 1947. 
References

 1. Alden, J. Lots of names—short sentences—simple words. Printer's Ink,
    June 29, 1945, 21-22.
 2. Bentley, Phyllis. Some observations on the art of the narrative. New York:
    Macmillan, 1947.
 3. Cowing, Amy C. They speak his language. J. Home Econ., 1945, 37, 487-489.
 4. Fihe, Pauline J., Wallace, Viola, and Schulz, Martha, compilers. Books for
    adult beginners; grades I to VII. Rev. ed. Chicago: American Library
    Association, 1946.
 5. Flesch, R. Marks of readable style; a study in adult education. New York:
    Bur. of Publ., Teachers Coll., Columbia Univ., 1943. (Contr. to Educ.
    No. 897.)
 6. Flesch, R. The art of plain talk. New York: Harper & Brothers, 1946.
 7. Flesch, R. How to write copy that will be read. Advertising & Selling,
    March, 1947, 1138.
 8. Gray, W. S., and Leary, Bernice E. What makes a book readable. Chicago:
    Univ. of Chicago Press, 1935.
 9. Gunning, R. Gunning finds papers too hard to read. Editor & Publisher,
    May 19, 1945, 12.
10. Lorge, I. Predicting reading difficulty of selections for children. Elem.
    English Rev., 1939, 16, 229-233.
11. McCall, W. A., and Crabbs, Lelah M. Standard test lessons in reading.
    Books II, III, IV, and V. New York: Bur. of Publ., Teachers Coll.,
    Columbia Univ., 1926.
12. Miller, L. R. Reading grade placement of the first 23 books awarded the
    John Newbery prize. Elem. Sch. J., 1946, 394-399.
13. Murphy, D. R. Test proves short sentences and words get best readership.
    Printer’s Ink, 1947, 218, 61-64.
14. Murphy, D. R. How plain talk increases readership 45% to 66%. Printer's
    Ink, 1947, 220, 35-37.
15. Sanford, F. H. Individual differences in the mode of verbal expression.
    Unpublished Ph.D. thesis. Harvard Univ., 1941.
16. Sherbow, B. Making type work. New York: Century, 1916.
17. Stevens, S. S., and Stone, Geraldine. Psychological writing, easy and hard.
    Amer. Psychologist, 1947, 2, 230-235. Discussion, 1947, 2, 523-525.
18. Foreign news written over heads of readers. Editor & Publisher, Dec. 28,
    1946, 28.
19. How does your writing read? U. S. Civil Service Commission. Washington:
    U. S. Government Printing Office, 1946.
20. Readability in news writing; report on an experiment by United Press.
    New York: United Press Associations, 1945.


The Purdue Pegboard: Norms and Studies of 
Reliability and Validity 


Joseph Tiffin and E. J. Asher 
Division of Education and Applied Psychology, Purdue University 


Extensive use of the Purdue Pegboard¹ in testing industrial applicants,
veterans, college students, and individuals seeking vocational guidance
has made available a considerable body of information regarding the
reliability and validity of the test, practice effects, and the general nature
of finger dexterity. This paper summarizes various studies of the
reliability and validity of the Purdue Pegboard dexterity tests and presents
a revised set of norms for the tests together with a discussion of the
implication of these findings for personnel testing.

The Purdue Pegboard is a test of manipulative dexterity designed to 
assist in the selection of employees in industrial jobs requiring manipu- 
lative dexterity, such as assembly, packing, operation of certain machines, 
and other routine manual jobs of an exacting nature. It provides sepa- 
rate measurements of the right hand, left hand, and both hands together, 
and measures dexterity for two types of activity: one involving gross 
movements of hand, fingers, and arms, and the other involving primarily 
what might be called “tip of the finger" dexterity needed in small as- 
sembly work. 


Construction 


Extensive observation and experiments have shown that people differ 
markedly in their ability to perform the manipulative operations that are 
required on many industrial jobs. Experiments have also shown that 
the basic dexterity of an employee as revealed by a manipulative dex- 
terity test is related to both quantity and quality of work on various jobs 
requiring such dexterity. Numerous dexterity tests, both in the form of 
pegboards and in other forms, have been in use for some time in many 
industrial plants. The Purdue Pegboard was designed to incorporate the
desirable features of several of these tests into a simple and easily
administered performance test. Of particular importance in its construction
and administration has been the standardization under conditions
in which a group of applicants or employees can be tested simultaneously.
An examiner using a battery of ten boards can test approximately 50
employees per hour, thus overcoming the tedious and costly administration
that has characterized many other dexterity tests. The present
form of the Purdue Pegboard has been standardized after extensive
experimentation in numerous plants which involved the testing of several
thousand employees in a wide variety of industrial jobs.

¹ Distributed for the Purdue Research Foundation by Science Research
Associates, 228 S. Wabash, Chicago, Ill.


Fig. 1. The Purdue Pegboard.


The Test Scores. Five separate test scores may be obtained with the 
Purdue Pegboard, namely: Right Hand; Left Hand; Both Hands; Right 
plus Left plus Both Hands (abbreviated R + L + B); and Assembly. 


Administration and Scoring 


The Pegboard is equipped with the pins, collars, and washers located 
in the proper cups. The operator should be seated comfortably at a
table approximately 30 inches high. The Pegboard should be directly 
in front of the operator with the cups containing the pins and other parts 
at the far end of the board. The extreme right and extreme left hand 
cups should each contain 25 pins. The pocket immediately to the right 
of the center should contain 20 collars, and the pocket immediately to 
the left of the center should contain 40 washers. 

Right Hand Test. The testee is instructed to pick up one pin at a time 
with the right hand from the right hand cup and place these pins in the
right hand row, starting with the top hole. The testee is allowed to put
in three or four pins for practice before this part of the test is begun. The
pins are then removed and the testee is allowed exactly 30 seconds to put
in as many pins as possible with the right hand, taking the pins from the
right hand cup one at a time. The right hand score is the total number of
pins the testee places with the right hand.

Left Hand Test. The procedure described above is followed for the
left hand. Practice with the left hand in placing three or four pins should
precede the administration of this part of the test. The testee is then
allowed exactly 30 seconds to take pins one at a time from the left hand cup
and place them in the left hand row of holes starting with the top hole. The
left hand score is the total number of pins the testee places with the left hand.
After the right and left hand sequences have been administered, all pins
are returned by the testee to the right and left cups, respectively.

Both Hands Test. This sequence tests both hands working together.
The testee simultaneously takes a pin from the right hand cup with the
right hand and a pin from the left hand cup with the left hand, and
simultaneously places both pins in the two rows of holes, starting with the
pair of holes farthest away from the testee. Practice in placing three or
four pairs of pins should be allowed before this test sequence is given.
After this practice and after all pins have been returned to their proper
cups, the testee should be allowed exactly 30 seconds to place as many
pairs of pins as possible, using both hands, each hand picking up and placing
one pin at a time. The both hands score is the number of pairs of pins that
are placed during the 30 second test period.

Right plus Left plus Both Hands. This score is obtained by combining
the test scores obtained from the test sequences described above. The
score is simply the number of pins placed with the right hand plus the 
number placed with the left hand plus the number of pairs placed with 
both hands. It should be remembered that the number of pairs of pins 
is used in adding the both hands score—not the number of pins placed 
with both hands. 

Assembly. This sequence tests more minute finger dexterity and
consists of assembling the pins, collars, and washers. To make certain
that the testee understands just what he is to do, the administrator should
instruct him as follows: “Pick up one pin from the right hand cup with
your right hand and while placing it in the top hole in the right hand row
pick up a washer with your left hand. As soon as the pin has been placed,
drop the washer over the pin. While the washer is being placed over the
pin with the left hand, pick up a collar with the right hand. While the
collar is being dropped over the pin, pick up another washer with the left
hand and drop it over the collar. This completes the first ‘assembly,’
consisting of a pin, a washer, a collar, and a washer. As the final washer
for the first assembly is being placed with the left hand, start the second
assembly immediately by picking up another pin with the right hand.
Place it in the next hole, drop a washer over it with the left hand, then a
collar with the right hand, and so on, completing another assembly.
Now, return pins, collars, and washers to their proper cups and get ready
to start. You will be given one minute to make as many assemblies as
you can.”

The most important point to explain in the instruction for this se- 
quence is that both hands should be operating all of the time, one picking 
up a pin, one a washer, one a collar, and so on. If necessary, the testee 
should be allowed to assemble four or five complete pin-washer-collar- 
washer assemblies before this test is begun, in order to make certain that 
he fully understands the “alternating” procedure. The testee must keep 
both hands moving at the same time. If he fails to do this, he should be 
given further instruction by the administrator.

When the testee is familiar with the procedure of making the assem- 
blies, he should be allowed exactly one minute to make as many such assem- 
blies as possible. The score on the assembly test is the number of parts 
assembled during the one minute of testing time. If eight complete 
assemblies are made, the score is therefore 32, since each assembly con- 
sists of four parts. If six complete assemblies are made and the pin and 
first washer of the seventh assembly are properly placed at the end of 
the minute, the score is 24 plus 2 or 26. This method of scoring is simpler
than using the number of complete assemblies because it eliminates the 
necessity of using quarters of an assembly in determining the score. 
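
The scoring rules described above can be written out as a short sketch. The function names are illustrative and not part of the test materials; the only facts used are that the combined score adds pins for each single hand to pairs for both hands, and that the assembly score counts parts, four to a complete assembly.

```python
# Sketch of the scoring rules described in the text (names are illustrative).

PARTS_PER_ASSEMBLY = 4  # pin, washer, collar, washer


def right_left_both_score(right_pins, left_pins, both_pairs):
    """R + L + B: pins placed by each hand alone plus PAIRS placed by both hands."""
    return right_pins + left_pins + both_pairs


def assembly_score(complete_assemblies, extra_parts=0):
    """Assembly score is the number of parts placed within the minute.

    extra_parts is the number of parts properly placed in an unfinished
    assembly (0 to 3).
    """
    if not 0 <= extra_parts < PARTS_PER_ASSEMBLY:
        raise ValueError("extra_parts must be between 0 and 3")
    return complete_assemblies * PARTS_PER_ASSEMBLY + extra_parts


print(assembly_score(8))        # eight complete assemblies
print(assembly_score(6, 2))     # six complete plus pin and first washer
print(right_left_both_score(16, 15, 13))
```

The two assembly calls correspond to the worked examples in the text, which give scores of 32 and 26.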


Norms 


Scores on the Purdue Pegboard dexterity tests have been accumulated 
over a period of several years from a number of different sources. Scores
for one trial on each of the tests are available at the present time on the 
following groups: 


                                      N
College Men                         461
College Women                       392
Veterans (Men)                     1958
Industrial Applicants (Men)         865
Industrial Applicants (Women)      4138


This can be seen in Table 1 in which the means and standard deviations 
for each population on each test are given. There is no appreciable
difference between the mean scores of college men and the mean scores of
veterans on the various tests. There is no appreciable difference between
college women and industrial women on the first four of the tests. A
sizable difference does exist, however, between the means of the two
groups on the assembly test. As would be expected from the data in
Table 1, it was found that the percentile norms for college men and
veterans were almost identical and that those for college women and
industrial women were the same except for the assembly test.


Table 1
Means and Standard Deviations of Various Groups on Purdue Pegboard

               College                Industrial   College     Industrial
               Men        Veterans    Men          Women       Women
               N = 481    N = 1958    N = 865      N = 392     N = 4138
Right Hand
   M           16.43      16.75       15.87        17.76       17.70
   SD           1.80       1.92        2.09         1.98        1.83
Left Hand
   M           15.91      15.98       15.16        16.48       15.98
   SD           1.77       1.99        1.98         1.66        1.99
Both Hands
   M           13.33      13.27       12.53        13.93       14.15
   SD           1.50       1.69        1.79         1.55        1.55
Total
   M           45.67      45.97       43.57        48.16       47.55
   SD           4.02       4.86        4.94         4.17        4.62
Assembly
   M           37.52      36.72       33.07        39.08       36.68
   SD           5.71       5.84        6.25         5.44        6.76


In view of these facts a single set of norms was calculated for college 
men and veterans, and similarly a single set of norms was calculated for 
college women and industrial women on all the tests except the assembly 
test. One and three trial percentile norms are given in Table 2 for college 
men and veterans, industrial men, college women and industrial women. 

Complete data from which three trial norms could be calculated for the 
several groups were not available on so extensive a basis as the data for 
one trial. However, data showing the improvement on the second and 
third trials were available on approximately 500 college students. The 
improvement from one to two to three trials on each of the tests (excepting 
the score obtained by adding the right hand, left hand, and both hand 
scores) is shown in Table 3. The improvement in scores shown in this 
table was used in calculating the expected three trial scores for the in- 
dustrial and veteran populations. These extrapolated scores were used 
in deriving the three trial percentile norms shown in Table 2. 


Table 2
Percentile Norms for the Purdue Pegboard

Right Hand

                     Men                                   Women
Industrial Applicants    Veterans and             Industrial Applicants
                         College Students         and College Students
1 Trial     3 Trials     1 Trial     3 Trials     1 Trial     3 Trials
Score %ile  Score %ile   Score %ile  Score %ile   Score %ile  Score %ile
                                     62   100                 64   100
20   100    60   100     21   100    60    99     21   100    62    99
19    99    58    98     20    99    58    96     20    99    60    97
18    97    56    97     19    98    56    91     19    96    58    94
17    90    54    92     18    92    54    88     18    90    56    89
16    74    52    8?     17    80    52    73     17    75    54    79
15    55    50    73     16    61    50    60     16    53    52    65

[Remaining rows of the Right Hand section and the whole Left Hand section
are illegible in the source.]


Table 2 (Cont.)

Men: Veterans and College Students

Both Hands
   1 Trial         3 Trials
Score   %ile    Score   %ile
  17    100       52    100
  16     98       50     99
  15     9?       48     96
  14     7?       46     92
  13     55       44     84
  12     30       42     70
  11     12       40     52
  10      3       38     35
   9      1       36     21
                  34     11
                  32      5
                  30      2
                  29      1

Right, Left and Both
   1 Trial         3 Trials
Score   %ile    Score   %ile
  57    100      176    100
  56     99      172     98
  54     97      168     97
  52     95      164     95
  50     87      160     93
  48     75      156     87
  46     55      152     79
  44     39      148     66
  42     23      144     54
  40     12      140     43
  38      5      136     33
  36      2      132     28
  35      1      128     15
                 124      9
                 120      5
                 116      3
                 112      1


Women: Industrial Applicants and College Students

Both Hands, 1 Trial
Score   %ile
  19    100
  18     99
  17      ?
  16     95
  15      ?
  14     62
  13     37
  12     16
  11      5
  10      2
   9      1

Right, Left and Both, 1 Trial
Score   %ile
  59    100
  58     99
  56     97
  54     94
  52     85
  50     72
  48     56
  46     39
  44     24
  42     18
  40      5
  38      2
  37      1

[The 3 Trials percentile columns for women are cut off in the source.]


Table 2 (Cont.)
Assembly

[Percentile norms for the Assembly test, 1 Trial and 3 Trials, for Veterans
and College Students (men), Industrial Applicants (men), Industrial
Applicants (women), and College Students (women). The entries are illegible
in the source.]




It is significant to note in Table 3 that the improvement on each test 
from trial one to trial two is less than one times the standard error of the 
difference. The improvement from trial two to trial three is still smaller. 
The improvement from trial one to trial three, while larger than between
trial one and trial two, is not large enough to satisfy the commonly used
criterion of statistical significance.


Table 3
Effect of Practice on the Purdue Pegboard Dexterity Tests
(N = 434 College Students)

                                         Difference between Means
                                  1st and        2nd and        1st and
Test                       M      2nd Trials     3rd Trials     3rd Trials

Right Hand 1st Trial     17.19
Right Hand 2nd Trial     18.42    1.23 ± 1.30    .39 ± 1.30     1.62 ± 1.33
Right Hand 3rd Trial     18.81

Left Hand 1st Trial      16.07
Left Hand 2nd Trial      16.99     .92 ± 1.15    .37 ± 1.15     1.29 ± 1.19
Left Hand 3rd Trial      17.36

Both Hands 1st Trial     13.68
Both Hands 2nd Trial     14.24     .56 ± 1.10    .33 ± 1.11      .89 ± 1.10
Both Hands 3rd Trial     14.57

Assembly 1st Trial       39.64
Assembly 2nd Trial       42.68    3.04 ± 3.63   2.00 ± 3.61     5.04 ± 3.56
Assembly 3rd Trial       44.68
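
The comparison made in the text, each gain set against its standard error, can be sketched as follows. The figures are the differences and standard errors as read from Table 3; the threshold of 2 used at the end is only an assumed stand-in for "the commonly used criterion," which the text does not name.

```python
# Sketch: ratio of each practice gain to its standard error (values read from
# Table 3). The threshold of 2.0 is an assumption, not stated in the text.

practice_gains = {
    # test: {comparison: (difference between means, standard error)}
    "Right Hand": {"1-2": (1.23, 1.30), "2-3": (0.39, 1.30), "1-3": (1.62, 1.33)},
    "Left Hand":  {"1-2": (0.92, 1.15), "2-3": (0.37, 1.15), "1-3": (1.29, 1.19)},
    "Both Hands": {"1-2": (0.56, 1.10), "2-3": (0.33, 1.11), "1-3": (0.89, 1.10)},
    "Assembly":   {"1-2": (3.04, 3.63), "2-3": (2.00, 3.61), "1-3": (5.04, 3.56)},
}


def critical_ratio(difference, standard_error):
    """Gain expressed in units of its own standard error."""
    return difference / standard_error


ratios = {
    test: {pair: critical_ratio(d, se) for pair, (d, se) in pairs.items()}
    for test, pairs in practice_gains.items()
}

for test, by_pair in ratios.items():
    print(test, {pair: round(r, 2) for pair, r in by_pair.items()})

ASSUMED_THRESHOLD = 2.0
print(all(r < ASSUMED_THRESHOLD for by_pair in ratios.values() for r in by_pair.values()))
```

Every first-to-second ratio falls below 1, and even the largest first-to-third ratio stays well under the assumed threshold, which is the pattern the text describes.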


Whether one uses the one trial or three trial method of administration 
and the corresponding norms for the Purdue Pegboard dexterity tests 
depends basically upon the situation in which the tests are used, the 
groups to be tested, and the purpose for which the testing is being done.
These considerations should in turn be viewed in the light of the
reliabilities of the one and three trial scores. For jobs in which the success of
the testing program depends more upon increasing the average success of
employees placed than upon individual measurement and counseling, the
single trial method of administration is often satisfactory. However,
when the most precise measurement possible of every individual is desired,
as in vocational guidance, it is recommended that the three trial method
of administering the tests be followed.

It can be seen in Table 1 that veterans are significantly superior on all
of the Pegboard tests to industrial men. This difference in the two
populations is reflected in the Table of Norms. The fact that norms
based upon industrial workers are too low for veterans has been pointed
out by Long and Hill² and by Strange and Sartain.³ This could well be
due to the fact that the norms for veterans are based upon those veterans
who have voluntarily consulted a Veterans Guidance Center, and veterans
who seek such counsel are not a randomly selected group of veterans
in general.

In this connection it is well to keep in mind that a table of norms is
nothing more nor less than the scores of some group or groups on a
particular test so scaled as to represent a sort of human measuring stick.
Comparison of an individual against this measuring stick can only
indicate where in the group the individual stands. If the statement, in
numerical terms, of an individual's position in the group does not provide
any useful information about the person or is not the information
desired, nothing is to be gained from comparing the individual with the
group or groups represented in the table of norms, even if no other norms
are available. In counseling an individual or assessing his qualifications
for a particular job it is well to keep in mind that the most significant
information regarding his test performance is a statement of where he stands
in the group with which he expects or hopes to compete. A high school
senior who is considering the advisability of industrial work in a particular
plant would find that a statement of his test performance in relation
to present employees of the plant on the job under consideration is of
more value than a statement of his position among high school seniors,
even though norms for high school seniors are available. This point is
further illustrated in personnel testing. It happens repeatedly that the
information needed about an applicant is how he compares with the
workers on the specific job for which he is an applicant, not with industrial
workers in general. It is necessary in such cases to obtain separate norms


tly different does not, therefore, mean that veterans’ scores 
always be interpreted in terms of veterans’ norms. Scores should 
be interpreted in terms of norms set up from a population from
the job or jobs on which the men being tested may be vocationally placed.


² Louis Long and John Hill. Additional norms for the Purdue Pegboard.
Occupations, 1947, 26, 160-161.
³ Strange and A. Q. Sartain. Veterans’ scores on the Purdue Pegboard.
J. appl. Psychol., 1948, 32, 35-40.


Reliability

Table 4 summarizes the results of several studies on the reliability
of the several tests given by means of the Purdue Pegboard. The
reliability coefficients for the one trial method of administering and scoring
the several tests were obtained by correlating test-retest scores on the
groups indicated. The reliability coefficients for three trial administration
were obtained by stepping up the one trial reliabilities by means of
the Spearman-Brown prophecy formula. Stepped up reliabilities on a
two trial basis are not given in Table 4, because the norms given above
cover only one trial and three trial methods of administration. Users
of the Pegboard can readily compute what the reliabilities of the tests on
a two trial basis would be by stepping up the one trial reliabilities to tests
of double their present length.
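
The stepping-up procedure can be sketched with the standard Spearman-Brown prophecy formula, r_k = k r / (1 + (k - 1) r), where r is the one trial reliability and k the number of trials. The function name is illustrative; the one trial values of .60 and .68 are those reported for the Left Hand and Both Hands tests.

```python
# Sketch: Spearman-Brown prophecy formula for a test lengthened k times.

def spearman_brown(one_trial_reliability, k):
    """Predicted reliability when the test is made k times as long."""
    r = one_trial_reliability
    return k * r / (1 + (k - 1) * r)


for label, r in [("Left Hand", 0.60), ("Both Hands", 0.68)]:
    two = spearman_brown(r, 2)    # two trial basis, not tabled in the paper
    three = spearman_brown(r, 3)  # three trial basis
    print(label, round(two, 2), round(three, 2))
```

Stepping .60 and .68 up to three trials gives about .82 and .86; doubling the length gives the two trial values that are left for the user to compute.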


Table 4
Reliability Coefficients for the Purdue Pegboard Dexterity Tests

                                                              1        3
Test              Group                                N    Trial   Trials***

Right Hand        College Students (men and women)    434   .63*     .84
Left Hand         College Students (men and women)    434   .60*     .82
Both Hands        College Students (men and women)    433   .68*     .86
Right+Left+Both   College Students (men)              159   .71*     .88
Assembly          College Students (men and women)    434   .68*     .86
Assembly          Radio Tube Mounter Trainees         288   .76**    .90
                  (women)

* Test-retest reliabilities of college students at Purdue University.
** From L. V. Surgent. The use of aptitude tests in the selection of radio tube
mounters. Psychol. Monogr., 1947, 61, 1-40.
*** Three trial reliabilities obtained in each case by “stepping up” one trial
reliability by means of the Spearman-Brown prophecy formula.


For many industrial purposes, the reliabilities of the one trial tests 
are sufficiently high to justify the use of this method of test administra- 
tion, provided, of course, that the tests are found to have significant 
validity for the particular jobs for which they are to be used.

In an industrial situation where the highest expectancy of a validity
coefficient is in the vicinity of .50, there is little to be gained by
lengthening the tests to increase their reliabilities above the values for one
trial administration given in Table 4. In the usual formula for correction for
attenuation, if .50 is, by definition, the “ceiling” of expected validity, an
improvement from a .60, for example, to a .90 reliability coefficient of
the test will only increase the obtained validity coefficient from .40 to .47,
assuming that the reliability of the criterion is 1.00. From the functional
viewpoint, a test having a validity coefficient of .40 can be made to work
as satisfactorily as one having a validity coefficient of .47 if it is possible
to reduce slightly the selection or placement ratio.⁴ When employees
are being hired for a variety of jobs (as is nearly always the case in an
expansion of personnel) such a reduction is usually feasible. Therefore, to
use the one trial method of administration of the test is often as
satisfactory in an industrial situation as the longer and more reliable three
trial method.
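
The arithmetic behind this argument can be sketched from the usual attenuation relation, in which the obtained validity equals the "ceiling" validity multiplied by the square root of the product of the test and criterion reliabilities. The function name is illustrative, and the figures are those used in the text.

```python
# Sketch: obtained validity under attenuation, given a ceiling of .50 and a
# perfectly reliable criterion, as assumed in the text.
import math


def obtained_validity(ceiling, test_reliability, criterion_reliability=1.0):
    """Validity expected after attenuation by unreliability."""
    return ceiling * math.sqrt(test_reliability * criterion_reliability)


low = obtained_validity(0.50, 0.60)   # one trial level of reliability
high = obtained_validity(0.50, 0.90)  # lengthened test
print(round(low, 3), round(high, 3), round(high - low, 3))
```

The two results are roughly .39 and .47, close to the .40 and .47 quoted above, so a large gain in reliability buys less than a tenth of a point in obtained validity.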
Validity 

Generalizations concerning the validity of any test should be made
with great caution, and this is particularly true of dexterity tests. As
Seashore⁵ has reported, motor skills are quite specific and ordinarily not
highly correlated with each other. This situation perhaps accounts for
the fact that a given dexterity test may have a quite satisfactory validity
for certain manipulative jobs and be unsuitable for other manipulative
jobs which might seem to be very similar. While the motor skills
measured by the different Pegboard tests may not be as specific as those
referred to by Seashore, the intercorrelations shown in Table 5 indicate that
the Pegboard tests do measure somewhat different skills. This is more
true of the assembly test than it is of the other three tests as one might
expect from the nature of the performances required on the assembly test.
Each of the tests is sufficiently unique to indicate that it is highly
desirable to conduct a study of the validity of the separate tests among
employees on specific jobs for which the use of each test is contemplated.
A number of studies of this type have been conducted, and a brief
summary of the results is given in Table 6. The validity coefficients obtained
in these several studies are given in the last column of Table 6, and furnish
a representative sample of the validity coefficients that may be expected
on various manipulative jobs with various criteria of job success.

⁴ As described by H. C. Taylor and J. T. Russell. The relationship of validity
coefficients to the practical effectiveness of tests in selection: Discussion
and tables. J. appl. Psychol., 1939, 23, 565-578.
⁵ R. H. Seashore. Standard motor skills unit. Psychol. Monogr., 1928, 39, 51-66,
and Individual differences in motor skills, J. gen. Psychol., 1930, 3, 38-66.


Table 5
Intercorrelations of Three Trial Scores on the Purdue Pegboard
Dexterity Tests (N = 434 College Students)
Note: Coefficients corrected for attenuation are shown in parentheses.

               Right      Left       Both
               Hand       Hand       Hands      Assembly

Right Hand                 .59        .69        .52
                          (.71)      (.81)      (.61)
Left Hand       .59                   .67        .50
               (.71)                 (.80)      (.60)
Both Hands      .69        .67                   .58
               (.81)      (.80)                 (.67)
Assembly        .52        .50        .58
               (.61)      (.60)      (.67)


Table 6
Results of Validity Studies with the Purdue Pegboard

             No. of
Test         Trials  Job                             Criterion                          N     r

Right Hand     1     Light machine operation         Make-up pay while learning        17   .56
Left Hand      1     Light machine operation         Make-up pay while learning        17   .23
Both Hands     1     Light machine operation         Make-up pay while learning        17   .21
R+L+B          1     Light machine operation         Make-up pay while learning        16   .31
Assembly       1     Light machine operation         Make-up pay while learning        16   .38
Right Hand     1     Light machine operation         Earnings after learning period    17   .52
Left Hand      1     Light machine operation         Earnings after learning period    17   .20
Both Hands     1     Light machine operation         Earnings after learning period    17   .07
R+L+B          1     Light machine operation         Earnings after learning period    16   .33
Assembly       1     Light machine operation         Earnings after learning period    16   .38
Assembly       1     Textile quilling                Production Index                  28   .45
Right Hand     1     Simple assembly of small parts  Production Index                  15   .26
Assembly       1     Simple assembly of small parts  Production Index                  15   .26
Assembly       3     Radio tube mounters             3 or more pooled overall ratings 283   .64*

* From Surgent, op. cit.
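
The corrected coefficients in Table 5 follow the usual correction for attenuation: each observed intercorrelation is divided by the square root of the product of the two reliabilities. The sketch below assumes three trial reliabilities of .84, .82, .86, and .86 for the Right Hand, Left Hand, Both Hands, and Assembly tests (the Spearman-Brown steps of the one trial values); the names are illustrative.

```python
# Sketch: correction for attenuation applied to the Table 5 intercorrelations.
# The reliabilities are assumed three trial values, not quoted from Table 5.
import math

reliability = {"Right": 0.84, "Left": 0.82, "Both": 0.86, "Assembly": 0.86}

observed = {
    ("Right", "Left"): 0.59,
    ("Right", "Both"): 0.69,
    ("Right", "Assembly"): 0.52,
    ("Left", "Both"): 0.67,
    ("Left", "Assembly"): 0.50,
    ("Both", "Assembly"): 0.58,
}


def corrected(r_xy, r_xx, r_yy):
    """Correlation between two measures after removing attenuation."""
    return r_xy / math.sqrt(r_xx * r_yy)


corrected_table = {
    pair: round(corrected(r, reliability[pair[0]], reliability[pair[1]]), 2)
    for pair, r in observed.items()
}

for pair, value in corrected_table.items():
    print(pair, value)
```

Under these assumed reliabilities the six corrected values agree with the parenthesized entries of Table 5, none of which approaches 1.00, consistent with the statement that the tests measure somewhat different skills.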


Summary 


Extensive Purdue Pegboard norms on several male and female populations
have been obtained from various industrial users of this test during
the past few years. The groups on which test scores were available have
been combined whenever such combination was justified by statistical
similarity of groups in mean and standard deviation. After the indicated
combinations were made, separate tables of percentile norms were set up
for the following groups:

                     Men                            Women

Right Hand           1. Industrial applicants       1. Industrial applicants
                     2. Veterans combined with         combined with college
                        college students               students
Left Hand            1. Industrial applicants       1. Industrial applicants
                     2. Veterans combined with         combined with college
                        college students               students
Both Hands           1. Industrial applicants       1. Industrial applicants
                     2. Veterans combined with         combined with college
                        college students               students
Right + Left + Both  1. Industrial applicants       1. Industrial applicants
                     2. Veterans combined with         combined with college
                        college students               students
Assembly             1. Industrial applicants       1. Industrial applicants
                     2. Veterans combined with      2. College students
                        college students


Norms for the above groups are given for both the one trial and three 
trial method of administration. The three trial norms were extrapolated 
from one trial norms, using data on the effect of practice obtained from 
one trial, two trial, and three trial administration of the Pegboard tests 
to 434 college students. 

Intercorrelations of the several Pegboard tests were computed from 
scores of 434 college men and women. The intercorrelations ranged from 
.50 to .69.

A summary of validity studies of the Pegboard for several industrial 
jobs is included. The obtained validity coefficients from 14 studies 
ranged from .07 to .76. The variation of these validity coefficients among 
industrial jobs, all of which were of a manual repetitive type, serves to 
re-emphasize the fact that the validity of the Pegboard should be sepa- 
rately determined for each job for which its use is contemplated. 

Received February 24, 1948. 
Early publication, 


The Development of Entrance Tests for the 
United States Coast Guard Academy * 


Sidney H. Newman and Joseph M. Bobbitt 
United States Public Health Service, Washington, D. C. 


The psychological program instituted at the United States Coast 
Guard Academy during the late war (1, 2, 3, 4) has continued in full 
force. One major objective of this program has been to improve the 
methods of selecting Cadets for the Academy, and it is now possible to 
indicate the changes which have been effected to date in Cadet selection
methods. The program furnishes an excellent illustration of the way in 
which psychological research and application need to be coordinated to 
obtain satisfactory results in developing selection methods for an institu- 
tion such as the Academy. It also gives a concrete example of the high 
degree of cooperation that can develop between military officials and
psychologists, both civilian and military, when they learn the value of 
their respective methods of approach to mutual problems. The manner 
of the operation of the Academy program, including full freedom for 
research and publication of findings, supplies one answer to Tyson’s (8) 
criticisms of military psychology, lending support to the views of Older
(6) on this point. 

The Coast Guard Academy offers a four year course leading to a
Bachelor of Science degree in engineering and a commission as ensign. 
In addition to the collegiate academic work at a high level of difficulty, 
the Cadet receives training in professional Coast Guard and Naval sub- 
jects and intensive work in physical education. He is under the super- 
vision of the Academy for eleven months out of the year, and every effort 
is made to determine whether or not the Cadet exhibits the academic, 
personality, and physical qualifications considered necessary for Coast 
Guard officers. Therefore, Cadet selection needs to be much more 
rigorous and comprehensive than does the selection of college students. 
Approximately 150 Cadets are selected each year from an applicant pop- 
ulation which has ranged from 700 to 2200 young men in recent years.


* The opinions or assertions contained in this paper are those of the authors and
are not to be construed as official or as reflecting the views of the U. S. Coast Guard.


History


Cadets are selected only on the basis of annual national competitive
tests, examinations, and evaluations (conducted each May for July
entrants). Before the war, candidates were required to take written essay
type examinations in high school English (three hours covering grammar, 
composition, and literature) and mathematics (three and a half hours 
covering algebra, plane and solid geometry, and trigonometry) (10). 
These tests were constructed by the respective Academy departments. 
In addition, the examiners (Civil Service and Coast Guard officers) in- 
terviewed each candidate and reported on his general fitness and adapta- 
bility for service as a Coast Guard officer. After the examination papers 
were graded on a percentage basis, an Adaptability Board of three com- 
missioned officers considered the general adaptability (leadership and 
personality characteristics) of those candidates who received not less than 
70 per cent in mathematics and English. The Board assigned marks in 
adaptability based on the examiner’s report, information on previous 
scholastic work, leadership achievements, and athletic and physical char- 
acteristics. The final mark was obtained by averaging the percentage 
grades in mathematics, English, and general adaptability. 

On the basis of research findings and experience, changes in the 
national competitive procedures have taken place gradually; an entirely 
new system was put into effect at the time of the entrance examinations
conducted in 1947. Research was begun in July, 1943, involving the 
class entering at that time and graduating in June, 1946. In the 1944 
tests, a vocabulary test developed by the personnel of the psychological 
program, an objective English test furnished by the Cooperative Test 
Service, and a 20 item Personal Inventory, selected from the Navy Per- 
sonal Inventory (7), were introduced. The Personal Inventory consti- 
tuted the first attempt to appraise objectively some of the characteristics 
considered in the adaptability grade. The written problems in mathe- 
matics and the English essay were retained. 

The tests given in 1945 (11) contained objective English and mathe-
matics examinations obtained by special arrangement with the Measure-
ment and Guidance Project in Engineering Education.¹ The English
essay, the vocabulary test, and some written problems in mathematics
were also administered. The Personal Inventory, U. S. Coast Guard
Academy form, was expanded to include the 60 items of the Navy Per-
sonal Inventory and 85 new experimental items based on Academy re-
search and clinical experience. A highly difficult test of quantitative and
verbal aptitudes, very ably constructed by L. L. and T. G. Thurstone on
the basis of data furnished by Academy psychologists, was introduced in
the 1945 competition. At this time a rating scale covering eleven charac-
teristics was devised to aid examiners in reporting the results of inter-
viewing the candidates.


¹ 437 West 59th Street, New York City.




In the tests and examinations administered in 1946 the mathematics
and English tests were completely objective, constructed by the College
Entrance Examination Board (9), which utilized the findings of the
Academy research program. The expanded Personal Inventory and
aptitude tests were continued. An Index of Background, Activities, and
Preferences, developed at the Academy to aid in the evaluation of adapta-
bility characteristics, was instituted. During the entire period under
discussion, 1944-46, the final entrance mark was calculated by averaging
the percentage grades in English, mathematics, and adaptability.


New Annual Competitive Tests 


The entrance tests administered in 1947 represented the culmination of
gradual changes which have resulted in a new system. The Regulations
Governing Appointments to Cadetships, revised June, 1946 (12), [...]
the aims and methods of the competitive examinations. This di[...]
was based on findings and recommendations presented in a progress
report on the Academy psychological program (3) and other relevant
information. The Regulations stated, “Successful completion of the
Academy course and success as an officer depends (1) on an adequate
educational background, (2) on the possession of aptitudes related to
both technical and cultural studies, (3) on a sincere interest in the Coast
Guard as a career, and (4) on relevant personality and physical charac-
teristics. . . . The complete examination will measure as fairly and
accurately as possible the extent to which the candidate meets the
general qualifications listed above. The tests will be objective in form
except that candidates may be required to write one or more short
essays on specified subjects.”


The Regulations indicated that tests would be given in the
following fields:


A. Achievement tests.
   1. English (Grammar, Composition, Literature, and
      Comprehension).
   2. Social Studies (American History, American Government,
      Political Science, Economics, and Current Events).
   3. Mathematics (Algebra, Plane Geometry, and Plane
      Trigonometry).
   4. Science (Physics, Chemistry, or both).
B. Aptitude and ability tests.
   1. Quantitative, mathematical ability.
   2. Verbal or linguistic ability.
   3. Ability to visualize spatial relations.
   4. Mechanical comprehension and ability to deal with mechanical
      problems.
   5. Aptitudes involved in scientific comprehension, study, and
      research.


In addition to the components of the entrance examinations listed 
above, the Regulations stated that tests of emotional stability, social 
adjustment, interests, background, and personality characteristics can 
be administered to aid the Adaptability Board in the evaluation of 
adaptability. 

The Regulations stipulated that the raw test scores are to be converted 
to standard scores. The Board may then set separate minimum require- 
ments in terms of standard scores on the achievement and aptitude tests, 
and candidates who fall below these levels can be eliminated from further 
consideration. The final mark of each candidate is computed by aver- 
aging the six sub-scores (all converted to standard scores) in accordance 
with the indicated weights: 


Weights 
English 20 
Social Studies 10 
Mathematics 20 
Science 10 
Aptitudes and Abilities 10 
General Adaptability 30 


Candidates are offered appointments in the order of their final marks 
until the vacancies for the year have been filled. A candidate who fails 
to be appointed may compete again in subsequent years without pre- 
judice provided he still meets the age and physical requirements. 
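
The scoring rule just described can be illustrated with a short sketch. Only the six weights come from the Regulations as quoted here. Everything else is assumed for illustration: the candidates and their scores are invented, the function names are hypothetical, and the standard-score scale (mean 50, standard deviation 10) is a guess, since the paper does not say which scale was used.

```python
from statistics import mean, pstdev

# Weights as listed in the Regulations; they sum to 100.
WEIGHTS = {
    "English": 20,
    "Social Studies": 10,
    "Mathematics": 20,
    "Science": 10,
    "Aptitudes and Abilities": 10,
    "General Adaptability": 30,
}


def to_standard_scores(raw_scores, scale_mean=50.0, scale_sd=10.0):
    """Convert raw scores to standard scores (assumed scale: mean 50, SD 10)."""
    m = mean(raw_scores)
    sd = pstdev(raw_scores)
    if sd == 0:
        # All candidates tied: everyone sits at the scale mean.
        return [scale_mean for _ in raw_scores]
    return [scale_mean + scale_sd * (x - m) / sd for x in raw_scores]


def final_mark(sub_scores):
    """Weighted average of one candidate's six standard sub-scores."""
    total_weight = sum(WEIGHTS.values())
    return sum(WEIGHTS[field] * sub_scores[field] for field in WEIGHTS) / total_weight


def offer_appointments(marks, vacancies):
    """Names in descending order of final mark, cut off at the number of vacancies."""
    ranked = sorted(marks, key=marks.get, reverse=True)
    return ranked[:vacancies]


# Invented raw scores on one test, converted to the assumed standard scale.
standard = to_standard_scores([30, 40, 50])

# Invented standard sub-scores for three hypothetical candidates,
# listed in the same order as WEIGHTS.
candidates = {
    "A": dict(zip(WEIGHTS, [60, 55, 65, 50, 58, 52])),
    "B": dict(zip(WEIGHTS, [50, 50, 50, 50, 50, 70])),
    "C": dict(zip(WEIGHTS, [45, 45, 45, 45, 45, 45])),
}
marks = {name: final_mark(scores) for name, scores in candidates.items()}
appointed = offer_appointments(marks, vacancies=2)
print(marks, appointed)
```

Because General Adaptability carries 30 of the 100 weight points, candidate B, who is average on every test but high on adaptability, finishes close behind candidate A in this invented example.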

The entrance examinations administered in May, 1947, were com- 
pletely objective and included the following: mathematics (two hours), 
English (two hours), social studies (one hour), science (one hour), verbal 
aptitude (35 minutes), spatial aptitude (35 minutes), mathematical apti- 
tude (35 minutes), Personal Inventory, Coast Guard Academy form (45 
minutes), and an expanded Index of Background, Activities, and Prefer- 
ences (90 minutes). The College Entrance Examination Board con- 
structed the achievement and aptitude tests, and will continue to do so 
in the future. 

* In order to be qualified to take the competitive tests a candidate must be between
the ages of 17 and 22. At the time of the tests, he must have completed satisfactorily
three and one-half years of high school work with 7 units in required work, including
three in mathematics, three in English, and one in physics. He must also pass a pre-
liminary physical examination before being allowed to take the entrance tests. In
addition to sufficiently high standing on the entrance examinations, high school gradua- 
tion is required for entrance to the Academy. 




It is not expected that the kinds of tests or the weightings given the
test scores will remain static. It is anticipated that the tests and
methods of computing the final entrance mark will change in accordance
with research findings.

Research 


The changes in selection methods described above have been based on
a continuing research program without which progress in test develop-
ment would have been virtually impossible. A central objective of the
research program has been to aid in the development of both academic
and non-academic (personality, leadership, professional, athletic) criteria
of success at the Academy, and to discover the psychological correlates,
intellectual and non-intellectual, of such criteria. Since June, 1945, the
College Entrance Examination Board has been an extremely important
part of the test development and research. A comprehensive set of
tests, requiring about 20 hours of testing time, is administered to each new
entering class during the preliminary summer term. Psychological in[...]
as used with Reserve officer candidates, have been found to be [...]factory
(1, 2, 5). Beginning in July, 1943, the testing program at first
utilized commercial and Navy tests, in addition to those constructed at
the Academy (1). In the early stages of the program testing time was
allotted on an informal basis. The experimental testing program has
now developed to the point where it is formally inserted into the prelimi-
nary summer term schedule, and it makes use of many tests specially
constructed by the College Board for the program.

The most important functions of the experimental testing program are:


(1) To discover which psychological characteristics, as measured by
the tests and interviews, are related to academic, non-academic,
professional, and athletic performance at the Academy.

(2) To aid in the development of suitable criteria of success at the
Academy.

(3) To discover the best methods and items for measuring character-
istics which have been found to be related to Academy perfor-
mance, broadly defined.

(4) To discover relationships between the tests, Academy perfor-
mance, and subsequent behavior as a Coast Guard officer.

(5) To pretest items and develop a library of test items for the en-
trance tests.

(6) To compare entering classes with each other and with other
groups.

(7) To check the consistency of research findings from year to year,
and to keep a check on the effects that changes in curriculum and
methods for dealing with Cadets may have on previous findings.

(8) To develop scoring keys for tests of personality, interest, and
background.


This is not the place to show specifically how research findings have 
given rise to the development of tests and scoring keys, and how testing 
has produced further research activities. It is enough to say that the 
theoretical and practical aspects of the program are closely interrelated. 
It is also not intended to include research findings at this time. Studies
of the relationships between test scores and various academic and
non-academic measures of performance at the Academy are in progress.
Intercorrelational methods, including factor analysis, are being used to study
such problems. Test score comparisons and item analyses are being
carried forward through bi-serial correlation and contrasting group 
studies. Criterion analyses constitute basic and important parts of the 
research program. It is hoped that the research will produce data of both 
theoretical and practical import. Thus far, this has been the case. 
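
For readers unfamiliar with the bi-serial correlation mentioned above, the following sketch computes the textbook coefficient relating a pass/fail test item to a continuous criterion score: the difference between the criterion means of the passing and failing groups, divided by the standard deviation of all criterion scores, multiplied by pq/y, where p and q are the proportions passing and failing and y is the ordinate of the unit normal curve at the point dividing p from q. This is not taken from the Academy's procedures; the function name and the data are invented for illustration.

```python
from statistics import NormalDist, mean, pstdev


def biserial_r(item_passed, criterion):
    """Biserial correlation between a dichotomous item and a continuous criterion."""
    passed = [c for ok, c in zip(item_passed, criterion) if ok]
    failed = [c for ok, c in zip(item_passed, criterion) if not ok]
    if not passed or not failed:
        raise ValueError("both a passing and a failing group are required")

    p = len(passed) / len(criterion)   # proportion passing the item
    q = 1.0 - p                        # proportion failing the item
    sd = pstdev(criterion)             # SD of all criterion scores
    if sd == 0:
        raise ValueError("criterion scores do not vary")

    # Ordinate of the unit normal curve at the point separating p from q.
    unit_normal = NormalDist()
    y = unit_normal.pdf(unit_normal.inv_cdf(p))

    return (mean(passed) - mean(failed)) / sd * (p * q / y)


# Invented data: eight hypothetical cadets, one item, one criterion grade each.
item = [1, 1, 0, 1, 0, 1, 0, 0]
grades = [78, 85, 80, 90, 70, 74, 82, 66]
r = biserial_r(item, grades)
print(round(r, 2))
```

The coefficient assumes that a normally distributed trait underlies the pass/fail split; with small or markedly non-normal samples it can exceed 1, which is one reason such figures are usually checked on a second group.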


Comment 


It must be apparent that such a program as has been described in 
this paper could not develop without the wholehearted cooperation and 
support of Coast Guard officials, particularly the Academy administration 
and staff. Most important has been the recognition that a program of 
this kind requires continued research. It is absolutely necessary to con- 
struct yearly entrance tests, and well-validated tests made up of pre- 
tested items could not be long forthcoming without such supporting 
research. It should also be pointed out that the research problems pre- 
viously mentioned still need much work in order to further their solution. 

The cooperation that exists between the College Entrance Examina- 
tion Board and the Academy should also be stressed. The psychological 
program cannot progress without proper testing materials, such as those 
now furnished by the College Board. The yearly entrance tests require 
the services of a staff of test construction experts and it is much more 
satisfactory and economical to make use of a well-developed staff than
to attempt to build one. The construction of the entrance tests is not a 
sufficiently large undertaking to warrant the permanent services at the 
Academy of a staff such as is available at the College Board. The College 
Board also gives invaluable aid to research projects, and furnishes statis-
tical services when necessary.




Summary 


The gradual evolution of the national competitive examinations for 
the United States Coast Guard Academy has been described. The tests 
and examinations have developed from completely subjective to com- 
pletely objective types. Scoring methods have been changed from per- 
centage grades to standard score procedures. The achievement testing 
has been broadened and aptitude tests have been added. Tests have 
been introduced to furnish objective data on those personality and leader- 
ship characteristics which are termed general adaptability. Computation 
of the final mark upon which entrance to the Academy is based has been 
changed considerably. The interrelations between research and applica- 
tion have been stressed. The cooperative effort involving Coast Guard 
officials, Academy psychologists, and the College Entrance Examination 
Board has been emphasized.


Received October 29, 1947. 
References 


1. Bobbitt, J. M., and Newman, S. H. Psychological activities at the United States
Coast Guard Academy. Psychol. Bull., 1944, 41, 568-579.
2. Bobbitt, J. M., and Newman, S. H. Psychological study of factors involved in the
selection of cadets. U. S. Coast Guard Acad. Alum. Assoc. Bull., 1945, 6, 296-305.
3. Bobbitt, J. M., and Newman, S. H. The U. S. Coast Guard Academy psychological
program for regular cadets, progress report. New London: U. S. Coast Guard
Academy, 1945. Pp. vi + 76. Published for limited distribution.
4. Felix, R. H., Cameron, D. C., Bobbitt, J. M., and Newman, S. H. An integrated
medico-psychological program at the United States Coast Guard Academy,
[...] report. Amer. J. Psychiat., 1945, 101, 635-642.
5. Newman, S. H., Bobbitt, J. M., and Cameron, D. C. The reliability of the inter-
view method in an officer candidate evaluation program. Amer. Psychologist,
1946, 1, 103-109.
6. Older, [...] In defense of military psychology. Amer. Psychologist, 1947, 2, [...]
7. Shipley, W. C., and Graham, C. H. Final report in summary of research on the
Personal Inventory and other tests. (OSRD, 1944; Publ. Bd., No. 12000.) Wash-
ington, D. C.: U. S. Dept. Commerce, 1946. Pp. 39.
8. Tyson, R. Footnote to military psychology. Amer. Psychologist, 1947, 2, 21-22.
9. Forty-sixth annual report of the Executive Secretary. College Entrance Examination
Board. New York: College Entrance Examination Board, 1946. Pp. xii + 105.
10. Regulations governing appointments to cadetships in the United States Coast Guard.
Revised June, 1940. U. S. Treasury Department, Coast Guard. Washington,
D. C.: U. S. Government Printing Office, 1940. Pp. 26.
11. Regulations governing appointments to cadetships in the United States Coast Guard.
Revised June, 1944. Navy Department, Coast Guard. Washington, D. C.:
U. S. Government Printing Office, 1944. Pp. 21.
12. Regulations governing appointments to cadetships in the United States Coast Guard.
Revised June, 1946. U. S. Treasury Department, Coast Guard. Washington,
D. C.: U. S. Government Printing Office, 1946. Pp. 21.


An Analysis of Grievances and Aggrieved Employees
in a Machine Shop and Foundry * 


Arthur C. Eckerman 
Division of Education and Applied Psychology, Purdue University 


The person working on the labor relations front in industry knows 
how slowly progress is being made in bringing labor and management 
to a closer understanding of their mutual problems, problems which are 
inevitably reflected in the endless grievances that must be processed. 
Any program which simply proposes a different method of handling griev- 
ances is only dealing with symptoms; the real causes underlying the
complaints of labor will lie untouched. 

Various discussions concerning the problems of present day labor re- 
lations led to the research represented by this paper. The hypothesis 
was developed that a statistical analysis of grievances might indicate 
significant differences existing between employees having grievances and 
non-aggrieved employees. 

A large Midwestern plant allowed this study to be made of its griev-
ances and aggrieved employees. The names of the city, state, and com-
pany are withheld, and the nature of the company's products is unimpor-
tant to the research. Two unions had contracts with the plant, a machine
shop union and a foundry union.

Grievances at the plant were divided into two classes, oral and written.
As no record was kept of grievances in the first and second steps it was
impossible to get an estimation of the number and the nature of these
grievances. After the grievance had been reduced to writing in the third
step a complete and accurate file was kept on it regardless of its disposi-
tion. Therefore, in this study of the grievances of the plant only those
grievances were used that had reached the third step with subsequent
reduction to writing.

This situation was fortunate from two standpoints. First, disre-
garding steps one and two probably reduced the size of the study con-
siderably and simplified it. Secondly, by not using the first two steps of
the grievance procedure the results are probably more valid from the
standpoint of being an accurate description of real labor problems in the
plant. Grievances in steps one and two may perhaps more correctly be
described as complaints of employees. Only when these complaints are
found to be real differences of opinion between the thinking and the pro-
gram of the union, on one hand, and the thinking and the policies of the
company, on the other, may they be considered true grievances. At
this stage they are formalized by writing and are taken out of the hands
of operating supervision and operating union officials alike to become
matters of genuine concern of the managements of both the union and the
company.

* This article is based on the author’s dissertation of the same title submitted to the
Faculty of Purdue University in partial fulfillment of the requirements for the degree
of Doctor of Philosophy, February, 1948. The dissertation was directed by Dr. Joseph

The research was undertaken with no thesis in mind; there was no 
thought of proving any preconceived opinions, either unionwise or com- 
panywise. The only hypothesis was that if significant differences exist 
between aggrieved and non-aggrieved employees, this type of research 
might identify and describe those differences. 

It is hoped that this research will be of some help to American labor 
and industry in their ceaseless striving to arrive at a better understanding 
of their mutual problems. 

Procedure 


A survey of the grievance files of the plant revealed a source of data 
complete in detail for each grievance and in chronological order. The 
personnel records of the plant were also in excellent order and quite 
complete. 

A work sheet was made up which contained two sets of data, (a) the
pertinent facts of nine items of the grievances, and (b) all available in-
formation concerning the aggrieved, which consisted of 75 items. From
the very complete grievance and personnel records of the company 1067 
work sheets were filled in which represented 766 separate grievances of 
327 employees. A number of employees, mostly union officials, each 
filed more than one grievance. The average number of grievances per
employee of the group having grievances was 2.3. The first grievance
filed by an employee was designated as an “initial grievance.” Table 1
shows the distribution of grievances and grievers.


Table 1
Number of Grievers and Grievances

                 Foundry   Machine Shop   Total
Grievers           223         104         327
   Initial         150          92         242
   Other            73          12          85
Grievances         644         122         766
   Initial         223         104         327
   Other           421          18         439




In the foundry agency 223 initial grievances were found and 104 in the 
machine shop. Work sheets were made up for two control groups con- 
sisting of 201 foundry employees, selected at random from the personnel 
files, who had not filed a grievance and 100 machine shop non-aggrieved 
employees also selected at random from the personnel files. 
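
The counts reported so far can be cross-checked with a few lines of arithmetic. The sketch below uses only figures stated in the text and in Table 1. The observation that the 1067 work sheets equal the 766 grievances plus the 301 control employees is an inference from those figures, not something the paper states.

```python
# Counts taken from Table 1 and the surrounding text.
grievances = {"foundry": 644, "machine shop": 122}
initial_grievances = {"foundry": 223, "machine shop": 104}
other_grievances = {"foundry": 421, "machine shop": 18}
control_employees = {"foundry": 201, "machine shop": 100}

total_grievances = sum(grievances.values())
# Each griever has exactly one initial grievance, so this is also the number of grievers.
total_grievers = sum(initial_grievances.values())
average_per_griever = total_grievances / total_grievers

# Initial and other grievances should add up to the total in each unit.
for unit in grievances:
    assert initial_grievances[unit] + other_grievances[unit] == grievances[unit]

# Inferred: one work sheet per grievance plus one per control employee.
work_sheets = total_grievances + sum(control_employees.values())

print(total_grievances, total_grievers, round(average_per_griever, 1), work_sheets)
# prints: 766 327 2.3 1067
```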

Items on the work sheets were coded, and the resulting information 
was punched on I.B.M. cards. Two sets of data were tabulated from the 
punched cards, data on the grievances and data on the grievers and the 
control groups. The foundry data were separated from that of the ma- 
chine shop. Each bargaining agency then had two sets of data, grievance 
information and personnel information. The grievance data had two 
divisions, that of (a) initial grievances and (b) other grievances. The 
personnel data also had two divisions, (a) aggrieved and (b) non-aggrieved 
employees. 

A statistical analysis was made of the results of the tabulation. The 
figures were expressed in either per cents or medians. A number of items 
such as vacation, yearly income, and others were calculated only for a 
twelve month period, the calendar year of 1946. All figures are compara-
ble, having been equated to take care of variables introduced by a general
wage raise. 

The difference between each of the respective groupings of grievance 
and personnel data was computed. The standard error of each difference 
was also computed. Fisher’s t statistic was computed by dividing the
difference by the standard error of the difference. An entry from Fisher’s
table was then obtained for each t value and the probability P determined.

Each t value is indicative of a level of significance which may be in-
terpreted as the probability that a difference as large as the obtained
difference could have occurred if the samples were drawn from the same 
population, or to put it another way, as the probability that the difference 
could have occurred by chance alone. 

If a difference as large as the obtained difference could have occurred
as frequently as 5 times in 100 among pairs of samples drawn from the
same population, the difference is considered significant at the 5% level.
Since chance alone could account for a difference as large as the obtained
difference only 5 times in 100, the null hypothesis that the true difference
is zero can be rejected. A difference significant at the 5% level or lower
is marked on the tables with a double dagger.

If a difference as large as the obtained difference could have occurred
as frequently as 10 times in 100 among pairs of samples drawn from the
same population, the difference is considered significant at the 10% level.
A difference significant at the 10% level is marked on the tables with a
dagger. If a difference as large as the obtained difference could have
occurred as frequently as 20 times in 100 among pairs of samples drawn
from the same population, the difference is considered significant at the
20% level. A difference significant at the 20% level is marked on the
tables by one asterisk.

These markings of the three levels of significance are made to
facilitate a rapid identification of the most probably significant items.
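
The procedure described above, a difference divided by its standard error and then graded against three significance levels, can be sketched as follows. Several things here are assumptions rather than the author's method: the paper does not give its standard-error formula, so the usual one for the difference between two independent percentages is used; the probability is approximated from the normal curve instead of being read from Fisher's table, which is reasonable only for samples of this size; and the probability is taken as two-sided. The group sizes (223 and 201) are the foundry figures from the text, but the two percentages are invented.

```python
from math import sqrt
from statistics import NormalDist


def t_ratio(pct_a, n_a, pct_b, n_b):
    """Difference between two percentages divided by its standard error."""
    difference = pct_a - pct_b
    # Assumed formula: standard error of a difference between independent percentages.
    standard_error = sqrt(pct_a * (100 - pct_a) / n_a + pct_b * (100 - pct_b) / n_b)
    if standard_error == 0:
        raise ValueError("standard error is zero; the ratio is undefined")
    return difference / standard_error


def two_sided_p(t):
    """Normal-curve approximation to the chance probability of a ratio this large."""
    return 2 * (1 - NormalDist().cdf(abs(t)))


def marker(p):
    """Table marking for the three levels described in the text."""
    if p <= 0.05:
        return "‡"   # double dagger: 5% level or lower
    if p <= 0.10:
        return "†"   # dagger: 10% level
    if p <= 0.20:
        return "*"   # asterisk: 20% level
    return ""        # not marked


# Invented example: 60% of 223 grievers against 50% of 201 non-grievers.
t = t_ratio(60.0, 223, 50.0, 201)
p = two_sided_p(t)
print(round(t, 2), marker(p))
```

Note that a 20% level is far more lenient than present-day practice; the graded marking serves the purpose stated in the text, a rapid pointer to the differences least likely to be chance.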

The results of the analysis of data concerning grievances are expressed
by comparing the respective standings of union members and union offi-
cials on each item. “Initial grievances” are those filed by union members
on their own behalf; “other grievances” are those filed by union officials,
usually for a cause furthering the union’s program or the operation of the
agreement.

The results of the analysis of all data concerning aggrieved employees
are expressed by comparing the respective standings on each item of
the grievers and a control group of employees having no grievances.

Only those items were used in comparing the respective groups where
the number of cases involved was large enough for statistical handling.
Differences between the groups of grievers and non-grievers that are prob-
ably most significant are those in the t column of the Tables which are
marked with three asterisks. Values of t which are marked with [...]
asterisks might be considered significant, but as these values decrease they
are to be interpreted with increasing caution as chance factors are more
apt to be responsible for the difference as t values become smaller.


Results 


Grievance Data of Foundry and Machine Shop Employees. Nine
items concerning grievances were available. Several of the items, such
as the classification of grievances and the disposition of grievances, had
subdivisions. Only those items are shown in Table 2 which had large
enough numbers of cases to justify the computation of differences.

a. Union officials, as reflected in the grievances they file, do not refer
to the contract in the wording of their grievances as often as do union
members. This difference between the two groups is probably greater
in the foundry unit than in the machine shop.

b. Relative to the nature of grievances, more grievances appeared to
be filed by union officials concerning work and jobs than by union mem-
bers, particularly in the foundry unit, but the differences are not signifi-
cant. Union members file more grievances concerning pay and wages
than do union officials. However, of the grievances concerning job and
work or pay and wages, union members’ grievances show a larger per-
centage of the latter. This is particularly true in the machine shop where
the ratio is approximately one to three. Union members in the foundry
file more grievances on seniority than do their officials. The difference
is not significant in the machine shop where the same relationship appears
to hold. A higher percentage of machine shop grievances of union mem-
bers are concerned with seniority than among the foundry group, although
union officials of the machine shop agency are more concerned with sen-
iority problems, as reflected in their grievances, than are the foundry
union officials. Percentage figures on the other six items of the classifi-
cation of grievances may be found in Table 3.

[Table 2. Grievance Data of Foundry and Machine Shop Employees. For the
Foundry and the Machine Shop separately, the table gives initial grievances,
other grievances, the difference, and t for these items: Contract Not Referred
to; Classification of Grievances (Job and Work, Pay and Wages, Seniority);
Steps Grievance Settled In (Third Step, Fourth Step, Fifth Step); Disposition
of Grievance (Granted by Company, Denied by Company, Dropped by Union).
Grievance data in the table are expressed in percentages, and values
significant at the 20% and 5% confidence levels are marked. The numerical
entries are not legible in this copy.]

c. In the settlement of grievances no significant differences were found 
between the subject groups. Grievances settled in the third step had 
values in the same direction, in favor of the union members, in both the 
foundry and the machine shop. The number of grievances involved in 
the fifth step was too small to make a reliable comparison. The sixth 
step, arbitration, also had too few cases, there being only four with which 
to make any comparisons. 

d. The machine shop grievances of union officials are definitely
granted by the company more often than those of union members. There
is no difference between these two groups in the foundry. This would
seem to indicate that union officials of the foundry unit are not as able in
formulating and processing grievances as are the union officials in the
machine shop. The converse is likewise true in the machine shop unit:
more grievances of union members are denied by the company than are
those of union officials. It is indicated that in the machine shop, the
grievances that are dropped by the union are those of its members and 
not those of its officials, but the difference is not significant. In the 
foundry an equal number of grievances of both groups seem to be dropped 
by the union. 

Personal Data of Foundry and Machine Shop Employees. Seventeen 
items of personal data were available on the employee’s personnel record. 
These data were compared for the two groups, the grievers and the non- 
grievers. Table 4 gives the figures on the comparisons for these two 
groups in the foundry and in the machine shop. An analysis of the re-
sults obtained, when the personal data of aggrieved employees of the
foundry and the machine shop were compared with that of their corre- 
sponding control groups, showed several significant differences between 
the two groups, the grievers and the non-grievers. 

a. Little difference was found between grievers and non-grievers in
the foundry and machine shop in regard to education. More of the
machine shop grievers went further than the eighth grade than did the
non-grievers.

[Table 3. Number of Grievances by Classification. The table gives the number
of grievances in each of nine numbered classifications. Of the key to the
classifications only these entries are legible: 4, Promotion and Transfer;
8, Company business; 9, Matters for collective bargaining or mutual
agreement. The numerical entries are not legible in this copy.]

[Table 4. Personal Data of Foundry and Machine Shop Employees. For the
Foundry and the Machine Shop separately, the table gives grievers,
non-grievers, the difference, and t for these items: median age in years,
median height in inches, median weight in pounds; birthplace (per cent born
in each of four areas, of which State, North, and South are legible);
schooling (8th grade and under, over 8th grade); marital status (single,
married, children, other dependents, non-veteran); and, from the application
form, median number of jobs listed, median total time on those jobs, and per
cent employed at time of application. Values significant at the 20%, 10%, and
5% confidence levels are marked. The numerical entries are not legible in
this copy.]

b. It is indicated that foundry grievers are socially more stable in that
fewer of them are single, more of them are married and more of them have
children. A greater number of the foundry grievers have children than
do the machine shop grievers. As the needs of a family group are greater 
than а family without children and as grievances filed were primarily for 
more money, it would seem to follow that in the two machine shop groups 
employees with families should have more grievances. Differences run 
in the same direction as those in the foundry, but they are not significant. 

c. On the application form of the company, applicants hired who 
later filed grievances had had more jobs and had worked longer than ap- 
plicants who became non-aggrieved employees. Among the foundry 
group, more of the subsequent grievers had jobs when they applied for 
work at the plant than did the subsequent non-aggrieved employees. The 
reverse was true in the machine shop group where more of the non- 
grievers had jobs at the time of application, but the differences here were 
not very pronounced. Two reasons might be postulated why foundry 
workers who have grievances were looking for different jobs, (a) due to 
more of them having families they were in need of better paying jobs, 
(b) as a group they appear to be less stable socially than non-grievers. 

d. A greater per cent of employees born in the South were non-grievers
rather than grievers in both the foundry and machine shop. A study of
other places of birth revealed no significant differences between grievers
and non-grievers although all differences were in the same direction for
both the foundry and machine shop.

e. No appreciable differences were found between the grievers and 
non-grievers with respect to weight, height, or age. This is true of both 
the machine shop and foundry groups although the differences found were 
in the same direction for both groups. 


Personnel Records Data of Foundry and Machine Shop Employees. Of
the numerous items taken from the personnel records, [illegible in scan]
proved to be most interesting.


a. More foundry grievers had personnel transactions than
non-aggrieved employees. This indicates that foundry employees who are
subject to personnel changes, regardless of the nature of the transaction,
are most liable to be grievers.

b. Aggrieved employees have more total net service than do
non-grievers. The machine shop grievers have over two years more service
than non-grievers whereas the foundry grievers show approximately one
year more service than foundry non-grievers.

c. New employees who later became aggrieved for one reason or another
started at about eighteen cents an hour less than employees who did
not have grievances. This is true in both the foundry and machine shop.
However, it is indicated that at the time of the grievance, the aggrieved
employees were slightly higher in hourly rate than non-aggrieved
employees, although the difference is not significant.

[Table 5. Personnel Records Data of Foundry and Machine Shop Employees. Entries not recoverable from the scan.]

d. The total wage increase of the grievers from the time of hire to the
time of grievance was significantly higher than the wage increases received
by the non-aggrieved group for a corresponding period. It is demonstrated
by this study that both the union and the company were cooperating
in erasing wage differences between workers, the union in pointing
out the differences and the company in equating them.

e. The grievers as a group were more subject to layoff than were the
non-grievers, particularly in the foundry. However, when it came to
temporary layoffs, over twice as many non-grievers took temporary
layoffs as did the grievers. This would seem to indicate that the plant
supervision was well aware of problems it could make for itself, but on the
other hand it has been noted that grievers have considerably more net
service, hence more seniority, than do non-grievers.

f. In the matter of skill level of the respective groups, the semi-skilled
bracket of the foundry had the highest percentage of grievers in it. The
most grievers fall in the semi-skilled bracket of the foundry and machine
workers. It is indicated that the semi-skilled level of foundry employees
is most liable to produce grievers.

g. There seemed to be little difference between the annual earnings of
the two groups although it is indicated the grievers in the foundry earned
slightly more money in the year of 1946 than did the non-grievers. In
comparing the annual wages of veterans and non-veterans, it is indicated
that in both the machine shop and foundry veterans who were
non-grievers earned more per year than veterans who were grievers,
particularly in the machine shop group. The same relationship exists, though
not significantly, in the non-veteran group.

h. A study of the credit standing showed that of the foundry group
the company got many more dun letters on employees who were grievers.
It is indicated that more garnishments are served on foundry employees
who are grievers, but the opposite was found within the machine shop
group, although neither difference is significant. From information the
personnel department receives from credit stores, it would seem that
grievers are more heavily burdened with debts than are the non-grievers,
particularly among the foundry group.

i. The position of each employee in his respective job class or labor
grade was studied. In the foundry group more non-grievers had attained
maximum position in their job class. The opposite was true in the
machine shop, but not notably so. Of foundry employees in the middle
of their job class, more were grievers, while again the opposite held true in
the machine shop group in about the same small proportion. Little
significance can be attached to the few cases found in the minimum
position.

Medical and Welfare Data of Foundry and Machine Shop Employees.
The eighteen items on medical and welfare data in Table 6 show several
distinct differences between the subject groups.

a. From the standpoint of medical classification, machine shop
grievers are more healthy or less handicapped as more of them fall in
“Class A,” which means unrestricted job placement. However, the
grievers, particularly in the foundry, visit the dispensary more often
than do the non-grievers for medical attention other than for accident
care. A greater number of grievers file for disability benefits from the
Employes' Benefit Association than do non-grievers and a greater number
of grievers collect disability benefits. This probably explains why more
grievers belong to the Employes' Benefit Association than non-grievers,
particularly in the machine shop. The two groups of grievers are also
in the dispensary more frequently with shop accidents, for which they
have a much higher percentage of claims for accident benefits than
do the non-griever groups of both the machine shop and foundry.

b. Although the majority of employees subscribe to the group life
insurance plan of the Company, fewer grievers in the foundry have
membership in the plan than do non-grievers. The reverse is true in the
machine shop where more grievers hold group life insurance. Grievers in
both the foundry and machine shop subscribe much more heavily to the
group savings plan, which is a credit-union or lending agency, than do
non-grievers. Group hospitalization, on the other hand, shows an
opposite picture. Where practically all non-grievers carry the group
hospital plan, less than ten per cent of the grievers have hospital plan
membership. It might be suggested that grievers, as a group, have a greater
feeling of insecurity or a poorer management of money in that more of
them use the savings plan. Little, however, is known of the savings
habits of either group outside of the company plan.

c. Foundry grievers worked more days in 1945 and 1946 than did
non-grievers and received more vacation credit. In the machine shop group,
however, the grievers did not work many more days in the year than
non-grievers, but they received a great deal more vacation credit. This
is explained by the greater length of service of the group constituting
machine shop grievers.


[Table 6. Medical and Welfare Data of Foundry and Machine Shop Employees. Entries not recoverable from the scan.]


Summary and Conclusions

An attempt to make a contribution toward a better understanding of
the problems of labor and management, as reflected in aggrieved
employees and their grievances, was made by a statistical analysis of
grievances and their makers of a large Midwestern industrial plant. The
plant had two unions, a foundry union of five and one-half years' existence
by the end of 1946, the period covered by this study, and a machine shop
union thirteen months old prior to December, 1946. Foundry and
machine shop data were studied separately, the grievances of each being
separated into two groups: (a) initial grievances, those filed by union
members, and (b) other grievances, those filed by union officials. Only
grievances that had been reduced to writing were used in the study. A group
of non-aggrieved employees was equated with the aggrieved employees
as controls and 53 items of personal and personnel data of both groups
were compared for both the foundry and machine shop agencies.

In general, relative to grievances, the older foundry union and the 
machine shop union did not differ in many respects. Results of the 
study of grievances showed: 


a. The most frequent grievances are filed for pay and wages (30%);
the next largest group of grievances concerned jobs and work (28%), with
grievances concerning seniority coming third (10%).

b. Union officials filed the highest per cent of grievances on matters 
of jobs and work; union members filed the highest per cent of grievances 
on seniority and pay and wages. The majority of grievances filed did not 
refer to the contract in any respect; of those which did refer to the con- 
tract, union members and not officials were the more numerous. How- 
ever, in these items concerning grievances the differences were not signifi- 
cant enough to warrant any conclusions. 

c. Only in the machine shop was a significant difference found in 
grievances granted by the company. Here union officials had more of 
their grievances granted than did union members. 

d. This study analyzed 766 separate grievances of 327 employees.
It was indicated that grievers have held more jobs and have worked longer
than non-grievers and more of them, in the foundry group, had jobs at
the time of application to the company than did non-grievers. The
group of grievers was found to have worked longer for the company than
had the non-grievers and had accumulated more seniority, particularly
in the machine shop group as shown by vacation earned.

e. Grievers started at a significantly lower hourly rate than the non- 
grievers, but were equal at the time they filed their grievances. 

f. Grievers had received much larger wage raises than non-grievers. 

g. Although the annual earnings of the two groups were approximately
the same, grievers showed a higher skill level than non-grievers.
More of the machine shop grievers had reached maximum position in their
respective labor grades; the opposite was true of the foundry group.

h. The credit standing of grievers probably is lower than that of
non-grievers, as the grievers, particularly in the foundry, had more dun
letters in the company files as well as having been served a few more
garnishments. As far as the company records went, from demands made on
the company by credit stores, the foundry grievers were more in debt.

i. Grievers, as a group, go in very strongly for the group savings plan
at their plant, a credit-union or lending agency, yet very infrequently
participate in the group hospitalization plan.

j. More non-grievers, in the foundry, have membership in the group
life insurance plan; the opposite is true in the machine shop where grievers
appear more interested in life insurance.

k. More grievers subscribe to the Employes’ Benefit Association, and
this is definitely indicated, for the group paid many more visits to the
dispensary for medical reasons as well as shop accidents. More grievers
collect benefits for sickness and accidents as well as compensation for
[text missing in scan] than do non-grievers.

l. It was indicated that grievers, as a group, are in better physical
condition than non-grievers.

m. More grievers are married and have children than non-grievers,
particularly in the foundry.

n. Of employees who had been born in the South the larger per cent
were non-grievers.


Evidence is presented which demonstrates that employees of this
particular Midwestern manufacturing concern show significant differences
when divided into two groups, one composed of aggrieved employees, the
other of non-aggrieved employees. The study simply points out the
degree of difference between the two groups on various personal and
personnel items; it does not propose to explain the reason for the
differences found. In order to do this two approaches might be necessary:
first, that of opinion research built around the significant items, and secondly,
a sound clinical study might shed light on some of the reasons grievers
appear to be a different and possibly less stable group as reflected in their
medical, accident, and credit records. An analysis of grievances such as
is here presented might be of aid to both supervision and union officials
alike in finding where their problems lie. The results may definitely be
worked into a training program of both groups. The study demonstrates
that data concerning grievances and their makers are easily subject to
statistical analysis. It is hoped that the methodology used in this
investigation will stimulate further research on a broader industrial basis
and that the results here obtained will bring about a better understanding
of the problems of industrial employees.


Additional Distributions of Test Scores of Industrial
Employees and Applicants


Myles H. MacMillan 
Ingersoll Steel Division of Borg-Warner Corporation 


and 


Harold F. Rothe 
Stevenson, Jordan & Harrison, Inc., Chicago, Illinois 


In an earlier paper data were presented to show that applicants for
industrial jobs often make a distribution of employment test scores that is
different from the distribution of scores on the same test made by the
employees against whom the test had been validated. That is, the
distribution for applicants is shifted toward the higher, or better, end of the
scale. Three possible variables in the testing situations that may account
for this shift, namely age, military experience with tests, and combining
office and shop applicants’ data, were controlled in the previous analysis,
and were shown to be unrelated to the shift. One other factor was partly
controlled. This was the possibility that “the word gets around” among
the supply of potential applicants with the result that only the “better”
applicants apply. Another suggested reason was that the incentives to a
good test performance were higher for applicants than they were for the
employees who had been promised that their jobs would not be affected
by their test results. It was concluded that this latter phenomenon was
the reason underlying the shift.

Some additional data that are relevant to this problem have been
collected and it is the purpose of the present paper to present these and
to relate them to the problem. It is believed that these data lend further
support to the hypothesis that the reason for the shift is one of greater
test-taking incentivation, and not one of “the word getting around.”


Discussion 


If the hypothesis is true that the word gets around and attracts better
qualified applicants, it appears logical to assume that the word takes some
time in getting around. That is, the improvement in applicants'
qualifications should appear gradually and not all at once. Thus, the
improvement should be a gradual one and successive samples of applicants
should show successively higher distributions.

1 Rothe, H. F. Distributions of test scores of industrial employees and applicants.
J. appl. Psychol., 1947, 31, 480-483.

2 E. L. Stromberg. Testing programs draw better applicants. Person. Psychol.,
1948, 1, 21-29.

On the other hand, if the reason for the shift is one of incentivation, 
the shift should appear suddenly in a first sample of applicants, and suc- 
cessive samples of applicants should give the same distributions as the 
original sample of applicants, all being equally higher than the original 
distribution for employees. This, of course, depends upon the samples 
being small enough to reflect any shifts that may occur. At the same 
time the samples should be large enough to give statistical significance to 
any results that are analyzed. 


Data from One of The Original Plants 


Data are available from one of the original plants for one of the tests. 
The data presented here are the only ones that are available from the 
original situations because of decreased employment (with the decreased 
turnover) in those plants and because in one instance another test has 
been substituted for one of the original ones. 

The Code Identification Test was originally validated against fifty-six
employees. The first follow-up analysis was made after 129 applicants
had been tested and a shifted distribution was found. A second follow-up
was made when seventy-five more applicants had been tested. The
distribution for this second group was practically identical with the
distribution for the first group of applicants. These three distributions are
shown in Figure 1.

[Fig. 1. Test scores of industrial employees and applicants on Code Identification Test. Panels, top to bottom: 75 applicants, 129 applicants, 56 employees.]

These data are summarized in Table 1 where it can be seen that the 
Critical Ratio between the means of the two successive groups of ap- 
plicants is 0.12. 

Table 1
Test Scores of Employees and Applicants

Group           N      Mean    S.D.          C.R.
Employees       56     27.8    15.33
Applicants-1    129    34.9    [illegible]   2.97
Applicants-2    75     35.1    11.49         0.12

It is apparent that either the word got around immediately, once and
for all, or else some other variable was operating. It is concluded that
another variable, the incentivation of the various groups, explains this
shifting of applicants over employees.


Data from Other Plants 


Data are also available from another plant in which the Wonderlic 
Personnel Test was administered routinely to all applicants. Some ap- 
plicants were white and others were colored. The presence of a large 
group of colored applicants who live in one area of the city permits an 
ideal situation for the word to get around the normal labor supply of this 
plant if such is to happen at all. 


Table 2
Test Scores of Negro Applicants

Group    N      Mean    S.D.
1        100    9.9     7.92
2        100    9.0     6.23
3        100    8.4     7.19


The first three successive groups of one hundred applicants each, 
negro and white, were analyzed and are summarized in Tables 2 and 3. 

The critical ratios are, for groups 1 and 2, 0.84; groups 2 and 3, 0.6;
and groups 1 and 3, 1.35. It is especially interesting that these successive
samples of applicants showed lower, not higher, mean scores. If the
word got around here, it had a negative effect. This is contrary to the
expectations of those who believe that the use of tests will automatically
improve the qualifications of applicants.
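
The paper does not state how its critical ratios were computed. A minimal sketch, assuming the conventional definition (difference between two independent means divided by the standard error of that difference), is given below. The function name is hypothetical, and the inputs are the Table 2 figures for groups 2 and 3, with the third mean read as 8.4 (the scan prints "84"). Ratios computed this way for other pairs need not reproduce the printed values exactly, since the means are rounded and the authors' exact formula is unknown.

```python
import math


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two independent means over its standard error.

    Assumed formula; the source article does not spell out its own.
    """
    se_diff = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / se_diff


# Table 2, groups 2 and 3 (mean 8.4 is a reading of the scanned "84").
cr_2_vs_3 = critical_ratio(9.0, 6.23, 100, 8.4, 7.19, 100)
print(round(cr_2_vs_3, 2))  # 0.63
```

Under these assumptions the ratio for groups 2 and 3 comes to about 0.63, which agrees with the truncated 0.6 printed in the text and is far below the conventional threshold for significance.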

The critical ratios for the white applicants are, for groups 1 and 2, 
3.72, groups 2 and 3, 1.14 and groups 1 and 3, 2.45. Here again the shift 
was downward. 


Table 3
Test Scores of White Applicants

Group    N      Mean    S.D.
1        100    18.7    8.57
2        100    17.2    8.52
3        100    17.7    9.16


The same procedure of testing applicants before attempting to validate 
the tests was used at two other plants. In both instances standardized 
general intelligence tests were used. In one plant the second group of 
applicants had a higher distribution than the first group, with a C. R. of 
40, and in the other plant the distribution shifted downwards, with a 
C. R. of 2.37. There were about 150 persons in each sample in both of
these plants. 


Additional Controls Needed 


All of the above data point to the same conclusion, namely, that the 
use of tests has no effect on the qualifications of applicants. The mere use 
of tests does not attract “better” applicants. It is probably true that in 
a few instances some applicant may state that he has come to a specific 
plant because he heard tests are used and only intelligent people work 
there. But these are isolated instances, if they do occur, and are of no 
statistical significance. 

The data presented in these two papers have been collected under 
actual industrial employment office situations. There are still some con- 
trols lacking before this problem can be solved in an experimental manner. 
For example, there is the possibility that the various test administrators 
may have, for some unexplained reason, affected the various testees differ- 
ently. This possibility has been completely uncontrolled although in all 
instances standardized procedures were supposedly used. Nevertheless, 
the personalities of the administrators may have affected these situations. 

It would be desirable to test a group of applicants and at a later date
to re-test all those who had been hired. If the conclusion reached in
these papers is correct, it would be expected that the re-test distributions
would be lower than the employment office distributions for the same
persons. The writers have heard of one instance where this was done, with
the results described above, but no data are available to them on this
point.

Another point is that the employees described in these papers were
tested in large groups and the applicants tested either individually or in
very small groups. Perhaps the employees could have been more highly
incentivated if they had been tested individually. On the other hand, it is
possible that there was some social effect that did indeed lead them to
relatively high scores. If so, this group effect was apparently not as
effective as was the incentive of a job that was held up before the
applicants.


Conclusion 


The conclusions from present data are substantially the same as the
conclusions of the original paper. When tests are validated against an
existing employee force, a follow-up analysis is needed in order to
[word illegible in scan] the critical score, if a critical score is used. The
mere presence of tests in the employment office does not guarantee that
applicants will be more highly qualified than are the present employees.
The tests must be validated for the jobs in question. The greater
test-taking incentivation of applicants appears to account for the shifted
distribution of their scores as compared with the distribution of
employees’ scores.


Received April 29, 1948. 


On the Validity and Reliability of the Job 
Satisfaction Tear Ballot * 


Willard A. Kerr 
Illinois Institute of Technology 


The Tear Ballot for Industry, General Opinions, was developed in 1944 
for the purpose of attempting to measure job morale or job satisfaction. 
Items were obtained from examination of the psychological and personnel 
literature, and each item was subjected to the critical appraisal and re- 
vision of a panel of five industrial psychologists. Each word in each re- 
tained item was checked for acceptability at low vocabulary level against 
the Thorndike (3) word list. 

As this test utilizes the original tear method of response described
elsewhere (1), a saving of from ten to twenty per cent in administration 
time is effected by elimination of the necessity of distributing and col- 
lecting pencils. In the industrial situation, particularly when testing is 
done on company time, this saving becomes significant. A second im- 
portant advantage of the tear method is that the employee is impressed 
with its obvious anonymity when he finds that he is required to write no 
identification nor even any pen or pencil marks which might establish the 
identity of his replies to the questions on the tear ballot. 

An average employee can answer all questions and cast his anonymous 
ballot within two to three minutes. Acceptable reliability is obtained 
not by having a burdensome number of questions, but by utilizing the 
five-point continuum in answering, which, according to the experiments 
of Remmers and Adkins (2) and others, yields higher reliability than the 
use of fewer than five response alternatives. The addition of alternatives, 
in fact, increases reliability as predicted by the Spearman-Brown formula 
until at least a limit of five is reached. Addition of alternatives after five
is reached seems to extract such sharply decreasing returns in increased
reliability that the increase in administration and scoring time is probably
not worthwhile.
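
The Spearman-Brown relation mentioned above can be sketched as follows. This is the standard prophecy formula, r_k = k r / (1 + (k - 1) r), applied here on the assumption that the number of response alternatives plays the role of the lengthening factor k, as the text suggests. The function name and the starting reliability of .50 are illustrative only and do not come from the article.

```python
def spearman_brown(r, k):
    """Predicted reliability when a measure of reliability r is lengthened k-fold."""
    if k <= 0:
        raise ValueError("lengthening factor k must be positive")
    return k * r / (1.0 + (k - 1.0) * r)


# Illustrative values only: a two-alternative item format with reliability .50,
# extended to five alternatives (k = 5 / 2).
predicted = spearman_brown(0.50, 5 / 2)
print(round(predicted, 3))  # 0.714
```

The formula also shows the diminishing returns described in the text: each further increase in k adds less to the predicted reliability than the one before it.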

Validity 

It seems to be believed generally that the problem of validating a job
morale test or any other type of attitude scale is a very difficult one. This

* The author acknowledges the invaluable assistance of D. R. Elrod, G. R. Hugman,
. K. Kohler, W. A. McNichols, Jr., and Mervin Rudolph in completing these studies.

The Tear Ballot for Industry is available from the Industrial Psychology Laboratory,
Illinois Institute of Technology, Chicago 16, Illinois.


THE TEAR BALLOT FOR INDUSTRY, GENERAL OPINIONS
by Willard A. Kerr, Ph.D.

DEAR EMPLOYEE: It is the obligation of each of us in this company to try to improve the
happiness and welfare of others. You are one of the large random sample of employees being
asked to participate in this sincerely constructive scientific survey of opinions. No one will
ever be able to connect your name with this ballot. You don't sign your name; in fact, you
are not even required to expose your handwriting on this new type of opinion ballot! Only a
sincere and honest expression of your opinion is requested.

Check one answer to each question by TEARING THE ARROWHEAD

1. Does the company make you feel that your job is reasonably secure as long
as you do good work?
1. Yes, job seems wholly secure
2. Usually
3. About half the time
4. Rarely
5. No, job seems very insecure

2. In your opinion, how does this company compare with others in its interest
in the welfare of employees?
1. It's tops, shows more interest than any other
2. Slightly above average
3. It is average
4. Slightly below average
5. Poor, shows less interest than other plants

3. How does your immediate superior compare with other managers, foremen, or
section leaders as to supervisory ability?
1. Among the best
2. Slightly above average
3. Average
4. Slightly below average
5. Among the worst

4. Considering your work, are your working conditions comfortable and
healthful?
1. Yes, excellent
2. Slightly above average
3. Average for type of work
4. Slightly below average
5. No, very bad

5. Are most of the workers around you the kind who will remember you when you
pass them on the street?
1. Yes, they are very friendly
2. Yes, usually
3. About half the time
4. Rarely
5. No, they are unfriendly

6. Do you think your income is adequate for your living needs?
1. Yes, enough for enough luxuries
2. Slightly above average
3. Just enough for average comfort
4. Barely enough to get by on
5. Much less than enough to get along on


Copyright 1900 by INDUSTRIAL OPINION INSTITUTE 
AM rights reserved. Printed in 0.5.4. 


‘This scale 18 copyrighted. The reproduction of rt of it , ог in other way, whether 
т өгө ЭПА or аге Гг free for user Mt токула ӨГ the copyright law. For origin and ut 
of the tearing method, see Kerr, У, A., Where Thay Like fo Work, - JOURNAL OF APPLIED PSYCHOLOGY. 1943,27, 


ag 
EE 


Job Satisfaction Tear Ballot 277 


7- 00 you feel that you have proper opportusity to presest а proolen, completet 
or suggestion to the masagenent? 


Yes, always -- 


No, never ---~ 


B. Do you have confidence im the good intentions of the wanagenent? 


Yes, it is sincere -=-= 
Usually -------- 

Half the tine 
Not often ---- 
No, it is insincere -------- 


usuv» 


9. Do you have confidence in the good sense of the management? 


a. Yes, it is capable ам efficient eu 
2. [t is usnally efficient -- 
3. Half the time -— 
а. It is often inefficient - 
5. No. it is stupid and inefficient 


10. What effect is your experience with the company having upon your personal 
happiness? 


Slightly beneficial 
Little or no effect 
Slightly disturbing 
Extremely harmful === нтте 


12.5 Special problems: Please indicate Any or all of the following problems which 
are really sources of frequent annoyance to you: 
i. Inconvenient or undependadle transportation -- 


маою 


„МЕ: Fe con alt Unfairness in promotion policy --------------- = 
ај им Lack of time to take care of personal business 
dust report the Lack of attention to employee recreation ~ = 


Family troubles at home -----------------= 


a 
3 
"m 
5. Broken promises on part of supervisors — 
6 
7. Poor housing conditions or excessive rents -----— 


WE SHALL APPRECIATE YOUR PROPERLY TEARING each of the following tabulation itens 
12. Your sex: 


Won-Office ---—---— 


13. Your present work: 


14. Ате you a supervisor or foreman? Ме агае ЕДЕР Ор 


Day shift ----—----- 
15., Your hovts of work (chiefly) :———Swing shsft ----—— 


| 
— [ee 

-T------ 

Rotation shaft system ------- -——— 


Might shift ------~ 
Ела. 1. Sample of The Tear Ballot for Industry. 


belief has some basis in fact, but the difficulties are not insurmountable,
merely numerous; such tests ordinarily require validation from several
different angles or viewpoints. Attitude tests, particularly, have so
many potential validities (even the same tests applied to the same sample)
that one must specify "validity of what for what" when discussing
attitude test validation. It is obvious, on the other hand, that all of the
potential validities of any attitude test (or, for that matter, of any test
or even any non-psychological measuring instrument) cannot possibly
ever be determined. The validation reported herewith is one kind of
validation, a validation provided from a particular viewpoint which
has considerable psychological significance for the potential
efficiency of the job satisfaction tear ballot for indicating festering
spots of discontent throughout a factory, office, or large retail
establishment.
The rationale, not of the tear ballot itself, but of this particular
validation, may be stated as follows. To the extent that job adjustment
or maladjustment tendencies are persistently present within an employee's
personality pattern over a period of years, measurement of job
satisfaction on the present job will possess validity for predicting past
and future as well as present job satisfaction. Operationally, it is almost
impossible to relate (ethically) the anonymously obtained present job
satisfaction of a worker with his future job satisfaction. However, it is
possible and practicable to apply this validation approach to the
employee's work history. To pursue this objective a valid measure of
past job satisfaction is of course needed. In this study the assumed valid
measure of past job satisfaction is past tenure (turnover reversed) rate,
which is computed simply by dividing the number of years a worker has
been in the civilian labor market by the number of employers worked for
in the same period of years. While this is not a perfect measure of past
job satisfaction because of the numerous factors other than personal
morale which produce individual cases of turnover, it nevertheless is a
highly useful operational criterion in view of the fact that turnover is so
often the economically wasteful result of being "fed up" with the boss,
company, work associates, the job itself, etc.
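The tenure-rate criterion described above is a simple ratio, and a minimal sketch may make it concrete. The function name and the three respondents below are hypothetical illustrations, not data from the study.

```python
def tenure_rate(years_in_labor_market: float, employers: int) -> float:
    """Average years per employer: the reverse of a turnover rate."""
    if employers <= 0:
        raise ValueError("number of employers must be positive")
    return years_in_labor_market / employers


# Hypothetical respondents: (years in civilian labor market, employers worked for)
respondents = [(12, 3), (5, 5), (20, 2)]
rates = [tenure_rate(years, n) for years, n in respondents]
print(rates)  # [4.0, 1.0, 10.0]
```

A higher value means longer average stays with each employer, which is the sense in which the criterion is "turnover reversed."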

In other words, among those scoring low in satisfaction with present
employment are some workers who have been so dissatisfied in past jobs as
to have been exceptionally frequent job turnover cases in the past.
Theoretically, therefore, an efficient job morale index should be related to
some extent with the past turnover history of a reasonably representative
sample of employees. It is apparent that this validity demand is a
conservative rather than an easy one because, first, it asks that a
relationship be found when the criterion admittedly is to some large extent
a function of irrelevant factors, and, second, it asks that the test predict
not satisfaction on present job (which, if done, actually is enough) but
past turnover history as well.

In order to determine the ability of The Tear Ballot for Industry to
predict past turnover rate of a reasonably random sample of wage
earners, a supplementary tear ballot was prepared and stapled to the front
of the regular tear ballot. The supplementary ballot explained that "we
are attempting to poll a random sample of employees in all leading
industries and types of work" and requested the respondent to indicate the
one of the fifteen industries in which he is employed, "total years of
experience which you have had in all your civilian jobs together," and
"during these same total years, how many companies or institutions did
you work for?"

A total of four trained interviewers distributed themselves in public
places over a large southern city and administered anonymously the
regular ballot and its attached supplement to a total of 98 wage earners
distributed
by industries as follows: agriculture 1, building and construction 11, 
distributing and selling 22, education and research 5, finance and banking 
1, forestry and fishing 1, government 5, manufacturing 24, mining and 
minerals 2, printing and publishing 2, recreation 1, service occupations 5, 
telephone and telegraph 7, transportation 9. 


Table 1
Pearsonian Coefficients of Correlation between Tenure Rate and Job Satisfaction
Items in The Tear Ballot for Industry
Note: Coefficients significant at 1% level are set in bold face type. Remaining
ones significant at 10% level.

Item                                                                    r
1. Does the company make you feel that your job is reasonably
   secure as long as you do good work?                                 .45
2. In your opinion, how does this company compare with others in
   its interest in the welfare of employees?                           [illegible]
3. How does your immediate superior compare with other managers,
   foremen, or section leaders as to supervisory ability?              [illegible]
4. Considering your work, are your working conditions comfortable
   and healthful?                                                      [illegible]
5. Are most of the workers around you the kind who will remember
   you when you pass them on the street?                               .63
6. Do you think your income is adequate for your living needs?         [illegible]
7. Do you feel that you have proper opportunity to present a
   problem, complaint, or suggestion to the management?                .28
8. Do you have confidence in the good intentions of the management?    .33
9. Do you have confidence in the good sense of the management?         [illegible]
10. What effect is your experience with the company having upon
    your personal happiness?                                           [illegible]

Total satisfaction score, unweighted                                   [illegible]
Total satisfaction score, weighted                                     .36


A job tenure rate (reverse of job turnover) was then computed for
each wage-earner by dividing his total time in the civilian labor market
by the number of jobs held during the same period of time. This job
tenure rate was utilized as an independent criterion against which to
correlate the individual job satisfaction items of The Tear Ballot for
Industry. As indicated in Table 1 the past tenure item validity coefficients
range from .14 to .63 among the ten principal items. The responses of
Item 11 (the item included in the test for additional diagnostic use)
correlated from .06 to .60 with the criterion, but these correlations are
lacking in expected consistency, indicating that the score on Item 11
should not be included in the total score although a tabulation of replies
to Item 11 does have diagnostic usefulness. Inspection of Table 1 reveals
that seven of the ten principal items correlate significantly, at the one
per cent level of confidence, with the criterion, while the remaining items
are all significant at the ten per cent level or better. The unweighted
total score correlates significantly at the ten per cent level of
confidence, but when the test items are weighted for the specific purpose
of predicting past tenure rate, the total score is then found to yield a
validity coefficient of .36 with the criterion. While not high, this
coefficient is approximately the same as usually is obtained, for example,
between the better industrial test batteries and success on the job in
near-point acuity operations. Of course, verification of these findings
should be made with additional groups of workers before definite
conclusions are warranted.
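The article does not say which significance test was applied to the coefficients. As an illustration only, the sketch below assumes the usual t statistic for a Pearson correlation, t = r * sqrt(n - 2) / sqrt(1 - r^2), and applies it to the weighted total score (r = .36, n = 98). The function name is hypothetical.

```python
import math


def t_statistic_for_r(r: float, n: int) -> float:
    """t statistic (n - 2 degrees of freedom) for a Pearson correlation r."""
    if n <= 2:
        raise ValueError("need more than two cases")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Weighted total score against tenure rate, as reported in the text.
t = t_statistic_for_r(0.36, 98)
print(round(t, 2))  # 3.78
```

The resulting value would then be compared with a tabled critical value of t for 96 degrees of freedom at whatever confidence level is desired.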

The highest coefficient in Table 1 is that between tenure rate and
opinion of work associates, suggesting the great importance of the presence
or absence of strong inter-personal ties in preventing or causing employee
separation. In this particular validation, separation history seems notably
more related with that type of emotional security associated with
being surrounded with friends than with either job security feeling or
evaluation of adequacy of present wages. This finding suggests that the
ability to make and retain friendships is the stable personality factor
(longitudinal) involved most in job adjustment. Probably the relatively
negligible correlation between separation history and opinion of present
supervisor is obtained because this other aspect of emotional security
is generally more impersonal, formal, and less intimate than is the
typical employee "pals" relationship.


Reliability 

While an industrial morale test is intended, because of its anonymous
nature, only for departmental or factory rather than individual diagnosis
and prediction (and thus theoretically does not require the level of
consistency of measurement desired in instruments for individual
prediction), the accumulated evidence on the reliability of The Tear Ballot
for Industry indicates it to be as reliable as many psychological tests
which have been found to be of value for individual prediction.

Split-half reliability coefficients, corrected by the Spearman-Brown
prophecy formula, have been computed on eight different employee
groups. These coefficients, ranging from .65 to .82, follow (preceded by
identification and number of cases): 84 ship carpenters .82; 20 male retail
supervisors .68; 13 female retail supervisors .80; 7 male retail office
employees .65; 70 female retail office employees .80; 58 male retail clerks
.76; 86 female retail clerks .68; 125 female operators in a shirt factory
.73. The median coefficient in this accumulated experience is .75, which
suggests a satisfactory level of internal consistency for group diagnosis
and prediction.
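For readers unfamiliar with the correction, the sketch below shows the standard Spearman-Brown step-up for a test split into two halves, r_full = 2r / (1 + r), and recomputes the median of the eight corrected coefficients listed above. The half-test value of .60 is a hypothetical input chosen for illustration; the article reports only corrected coefficients.

```python
from statistics import median


def spearman_brown(r_half: float) -> float:
    """Full-length reliability estimated from a split-half correlation."""
    return 2.0 * r_half / (1.0 + r_half)


# Hypothetical half-test correlation, for illustration only.
print(round(spearman_brown(0.60), 2))  # 0.75

# Corrected coefficients for the eight employee groups, as listed in the text.
reported = [0.82, 0.68, 0.80, 0.65, 0.80, 0.76, 0.68, 0.73]
print(round(median(reported), 3))  # 0.745
```

With an even number of groups the median is the mean of the two middle values (.73 and .76), which agrees with the reported figure of about .75.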
Summary 


1. A brief measure of job morale, The Tear Ballot for Industry, util- 
izing the tear method of response, was constructed at low vocabulary 
level and administered to various business and industrial groups. 

2. The probability hypothesis was advanced that for certain person- 
ality determinants of job morale, the present is psychometrically an aver- 
age of recent past and near future; that some of the variance in job morale 
is a function of these relatively stable personality characteristics; and 
that for these reasons a valid test of job morale will correlate significantly
with past turnover rates of wage earners. 

3. This validity hypothesis is tested in a random sample survey of 
98 wage earners in a large southern city. A reversed turnover rate was 
obtained on each respondent along with the anonymous Tear Ballot. 
This tenure-rate criterion was found to correlate significantly at the 10%
level of confidence with each of the items of the test, even though much
of the variance of the criterion is due to irrelevant and uncontrollable
factors. The instrument successfully meets this test of validity.

4. In determination of test reliability on administration to eight 
different business and industrial groups, a median coefficient of relia- 
bility of .75 is obtained. 


Received October 21, 1947. 
References
1. Kerr, W. A. Where they like to work. J. appl. Psychol., 1943, 27, 438-42.
2. Remmers, H. H., and Adkins, R. M. Reliability of multiple choice measuring
instruments, a function of the Spearman-Brown Prophecy Formula. J. educ.
Psychol., 1942, 33, 385-390.
3. Thorndike, E. L. The teachers' word book. New York: Columbia University,
Bureau of Publications, 1921, revised 1927, pp. 134.


A Proposed Short Form of the Kuder Preference Record *


Ray W. Miles 
Louisiana State University, Baton Rouge, La. 


Anyone who has used the Kuder Preference Record¹ very extensively
has found occasions when he wondered if some shorter procedure could
be found to give a systematic record of one's vocational interests. Each
item in the Preference Record consists of three activities from which the
subject must choose one he likes most and one he likes least. Three
comparisons are necessary for each item: the first activity with the second,
the first with the third, and the second with the third. There are
fourteen items to the page and twelve pages in the booklet. Thus the subject
must make forty-two choices for each page or 504 choices in completing
the Preference Record. The average college student completes the record
in forty or forty-five minutes but those with less education, who read
and think at a slower rate, often take much longer. Occasionally a
counselor wishes a record of the vocational interests of a subject with
very limited education who takes an undue amount of time to make his
choices. It is obvious that one could complete three pages in about
one-fourth of the time required for twelve pages.
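The counts in the paragraph above follow from simple arithmetic, which can be checked with a short sketch. The variable names are illustrative only.

```python
from math import comb

activities_per_item = 3
items_per_page = 14
pages_in_booklet = 12
pages_in_short_form = 3

# Number of pairwise comparisons among three activities.
comparisons_per_item = comb(activities_per_item, 2)

choices_per_page = comparisons_per_item * items_per_page
choices_in_booklet = choices_per_page * pages_in_booklet
short_form_fraction = pages_in_short_form / pages_in_booklet

print(comparisons_per_item, choices_per_page, choices_in_booklet)  # 3 42 504
print(short_form_fraction)  # 0.25
```

The one-fourth figure assumes that every page takes about the same time to complete, as the text itself implies.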

Examination of the Preference Record and of the answer pad indicates
that certain pages might give an expression of one's interests, and that the
[text missing in source] process of counting it was found that the ratios
between possible scores on the entire record and partial scores one might
possibly make on the three pages corresponded rather closely to ratios
already found by [illegible] thirty-five completed answer pads. Possible
scores and suggested weights are shown in Table 1. It seems probable that a
subject might answer the items on pages seven, eight, and nine and that his
partial scores might then be weighted to give approximately the same
scores he would have made had he answered all twelve pages. Weights
proposed for these partial scores, shown in Table 1, are as follows:
Mechanical, 5; Computational, 3; Scientific, 5.5; Persuasive, 3.5;
Artistic, 3.5; Literary, 4.5; Musical, 5.5; Social Service, 3.5; and
Clerical, 4. It was found that in nearly every case the total actual score
was not greatly different from the product obtained by multiplying the
partial score by the weight suggested here.

* The assistance of Dr. Howard Turner of Southwestern Louisiana Institute,
Lafayette, Louisiana in the completion of this study is gratefully
acknowledged.

¹ G. Frederic Kuder, Preference Record, Chicago: Test Service Division,
Science Research Associates, 19[illegible].
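The weighting step just described amounts to one multiplication per key. A minimal sketch follows; the weights are those proposed in the text, while the function name and the partial scores in the usage example are hypothetical.

```python
# Weights proposed in the text for partial scores from pages seven to nine.
WEIGHTS = {
    "Mechanical": 5,
    "Computational": 3,
    "Scientific": 5.5,
    "Persuasive": 3.5,
    "Artistic": 3.5,
    "Literary": 4.5,
    "Musical": 5.5,
    "Social Service": 3.5,
    "Clerical": 4,
}


def weighted_scores(partial_scores: dict) -> dict:
    """Estimate full-record scores by multiplying each partial score by its weight."""
    return {key: score * WEIGHTS[key] for key, score in partial_scores.items()}


# Hypothetical partial scores for one subject (not data from the study).
print(weighted_scores({"Mechanical": 16, "Musical": 4}))
# {'Mechanical': 80, 'Musical': 22.0}
```

Percentiles would then be looked up for the weighted scores in the same way as for scores from the full record.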


Table 1
Data from Which Weights were Derived

                        Total       Possible     Weight Given
                        Possible    Score,       to Partial
Key                     Score       p. 7-8-9     Score

1. Mechanical           192         39           5
2. Computational        11[?]       36           3
3. Scientific           168         30           5.5
4. Persuasive           210         60           3.5
5. Artistic             158         42           3.5
6. Literary             159         36           4.5
7. Musical              69          12           5.5
8. Social Service       206         60           3.5
9. Clerical             177         45           4
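As a rough check on how the weights relate to the possible scores, the sketch below computes the ratio of total possible score to possible score on pages 7-8-9 for each key and compares it with the proposed weight. This is only an illustration: the text says the weights also took account of ratios observed in thirty-five completed answer pads, so exact agreement is not expected. The Computational key is left out because its total possible score is not fully legible in the source.

```python
# (total possible score, possible score on pages 7-8-9, proposed weight), from Table 1.
# Computational is omitted: its total possible score is not legible in the source.
TABLE_1 = {
    "Mechanical": (192, 39, 5),
    "Scientific": (168, 30, 5.5),
    "Persuasive": (210, 60, 3.5),
    "Artistic": (158, 42, 3.5),
    "Literary": (159, 36, 4.5),
    "Musical": (69, 12, 5.5),
    "Social Service": (206, 60, 3.5),
    "Clerical": (177, 45, 4),
}

# Ratio of total possible score to possible partial score, per key.
ratios = {key: total / partial for key, (total, partial, _) in TABLE_1.items()}

for key, (_, _, weight) in TABLE_1.items():
    print(f"{key:15s} ratio = {ratios[key]:.2f}   proposed weight = {weight}")
```

For the eight keys shown, each proposed weight lies within about 0.3 of the corresponding ratio.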


Answer pads completed by 205 men representing an educational
range from the fourth grade to the Bachelor's degree were taken for a
more careful examination, to find the reliability of the scores that would
have been obtained by using only pages seven through nine and weighting
these partial scores as suggested above. It was found that in keys 1, 2,
3, and 5, the actual scores were slightly higher on the average than the
weighted scores. In keys 4, 7, 8, and 9, actual scores were slightly
lower on the average than weighted scores. In key 6, however, weighted
scores tended to be considerably higher than actual scores; there was a
tendency for this to be true with the higher scores more often than with
the scores that were about average or below. Comparisons of actual
scores and weighted scores are shown in Table 2.

Table 2
Means and Standard Deviations, Actual Scores and Weighted Scores,
Based on 205 Cases

                      Actual Scores      Weighted Scores     Mean
Key                   Mean     Sigma     Mean     Sigma      Difference
1. Mechanical         83.60    20.80     80.80    22.10        2.8
2. Computational      35.85     8.55     32.65    10.20        3.2
3. Scientific         60.00    11.35     59.60    13.35         .4
4. Persuasive         74.70    14.70     75.50    16.80        -.8
5. Artistic           49.60    13.50     47.80    16.00        1.8
6. Literary           43.50    11.80     49.75    17.30       -6.25
7. Musical            17.52     9.15     19.30    11.00       -1.78
8. Social Service     74.40    14.70     76.10    19.00       -1.70
9. Clerical           56.80    13.80     57.70    16.80        -.9

It is worthy of note that weighted scores were more variable than
actual scores. There was a tendency for one who made a low actual
score on a key to make a relatively lower partial score on the three pages
studied. Also, those who made a higher actual score on a key tended to
get still higher scores on that key when their partial scores were weighted.
However, this is not a disadvantage, for the chief use of the Preference
Record is to point out preference areas within which one should find
occupations to investigate. The counselor's chief interest is to find in
which preference areas the subject has a greater than average degree of
interest. This purpose can be served if the subject answers only pages
seven through nine and if the partial scores are weighted as suggested
above and percentiles found for the weighted scores.

Coefficients of correlation between the actual scores and weighted
scores for 205 cases are shown in Table 3. These seem sufficiently high
to justify the use of only three pages of the Preference Record with those
very slow readers who would have difficulty in completing the whole
record. There might also be justification for using a short form of three
pages with a larger group when the time for testing is limited.


Table 3
Coefficients of Correlation Found Between Actual Scores and Weighted Scores for
205 Men Ranging from Fourth Grade Education to College Graduates

Key                   Correlation
1. Mechanical         .86
2. Computational      .91
3. Scientific         .76
4. Persuasive         .85
5. Artistic           .85
6. Literary           .91
7. Musical            .89
8. Social Service     .86
9. Clerical           .84




Summary 


1. Three pages of the Preference Record can be taken in one-fourth of
the time required for the whole record.

2. Pages seven, eight and nine of the Preference Record will yield
partial scores that are indicative of the total scores a subject would make
if he completed the record.

3. Partial scores obtained can be weighted to give scores approximately
the same as total scores. Weights for the various keys are:
Mechanical, 5; Computational, 3; Scientific, 5.5; Persuasive, 3.5;
Artistic, 3.5; Literary, 4.5; Musical, 5.5; Social Service, 3.5; and
Clerical, 4. Percentiles may then be found, using the weighted scores.

4. Coefficients of correlation between actual scores and weighted
scores for 205 unselected men were found to range from .76 to .91 for the
different scoring keys.

5. The use of this short form can effect a considerable saving in time.


Received November 3, 1947. 


Methods for Determining Patterns of Leadership Behavior in 
Relation to Organization Structure and Objectives * 


Ralph M. Stogdill and Carroll L. Shartle 
The Ohio State University 


The Personnel Research Board of Ohio State University has undertaken
a series of studies under the title "Leadership in a Democracy."
One phase of these studies includes an investigation of executive positions
and organization structures in industrial, military, educational, and
civilian governmental organizations. The aims of this research are to
develop improved methodology for studying leadership, to establish criteria
for judging it, and to prepare information and techniques which may be
useful in selecting and training persons who may occupy leadership
positions in various types of organization structures.

The studies are interdisciplinary in character, involving the points
of view of various sciences, particularly economics, psychology, and
sociology.

The objectives to be accomplished, and the postulates which determine
the methods employed in this research, have been formulated in
broad, general terms so as to provide scope for investigation. A
preliminary survey² of the experimental literature suggests that leadership
is not a unitary human trait, but is rather a function of a complex of
individual, group, and organizational factors in interaction. Leadership
resides in individuals, but only by virtue of their interaction with other
persons. Leadership must, therefore, be studied as a relationship
between persons, and as an aspect of organizational activities, structures,
and goals. A comprehensive formulation of the problem is required in
order to take these factors into account.

The methods being developed for these studies represent a rather
marked departure from those usually employed for the investigation of
problems in leadership. For this reason, it seems desirable to answer at
the outset certain questions that arise, and to state certain hypotheses
which determine the design of our methods. The studies are proceeding
on the assumption that leadership is a process of interaction between
persons who are participating in goal oriented group activities. Three
concepts are implied in this assumption which should be made explicit.
The first is that leadership resides in specific persons. The second is that
leadership is an aspect of group organization, and the third is that
leadership is concerned with attaining objectives.

* This particular study is a cooperative [illegible] U. S. Navy, Office of
[illegible] presented are those of the authors, and should not be regarded
as having the endorsement of [illegible].

¹ The Leadership Studies staff includes C. L. Shartle, Professor of
Psychology, Director; Alvin E. Coons, Assistant Professor of Economics,
Melvin Seeman, Instructor in Sociology, and Ralph M. Stogdill, Research
Associate in Psychology, Associate Directors.

² Stogdill, R. M. Personal factors associated with leadership: a survey of
the literature. J. Psychol., 1948, 25, 35-71.

If leadership is concerned with goal oriented group activities, it 
seems appropriate to study those members of an organization who deter- 
mine goals and objectives and who control the means by which these 
goals are attained. It is thus assumed that leadership in some form exists 
in top administrative positions, as well as at other levels in the
organization. The question as to whether leaders or executives are being
studied appears to be a problem at the verbal level only.

It is assumed that it is proper and feasible to make a study of leader- 
ship in places where leadership would appear to exist and that if a person 
occupies a leadership position he is a fit subject for study. One further 
assumption that has been made is that leadership is related to getting a 
job done and that it is therefore appropriate to study the work patterns 
of the leaders and of the followers and the working relationships among 
the members of the organization. The soundness of these assumptions 
is being tested in the research.

The methods and procedures of the leadership studies may be stated in 
general terms as follows: 


1. The first step has been to appraise the literature in the field, to 
formulate hypotheses which seem basic to the problem, and to develop 
methodology for the testing of these assumptions. 

2. The second step is to discover what leaders do. The facts 
concerning leadership activities and organization structures are ob- 
tained primarily by means of the interview, supplemented by direct 
observation, by questionnaires, and by a study of organization manu- 
als and other materials. The initial interview requires approximately 
three hours with each executive. A modified job analysis is made for 
each position. Sociometric methods are applied in analyzing organi- 
zation structures in relation to leadership activities. A number of 
new techniques are in the process of development. The Navy studies
here described are primarily in the second stage, which involves the 
accumulation of data and the development and improvement of 
methodology. 




3. The third step involves the development of methods for the 
analysis of data, in order to discover the relationships between such 
factors as responsibilities, authority, work patterns, level in the or- 
ganization, compensation, persons with whom most time is spent, 
proportion of time spent in individual effort, methods of getting work 
done, methods of working with staff, type of organization structure 
and objectives of the organization. 

4. The fourth step, as now projected, will be undertaken after 
data have been accumulated from the study of a variety of organiza- 
tions. The ultimate objectives, as previously indicated, are the 
development of criteria for evaluating leadership in various types of 
organizations, and the preparation of information and techniques 
which will be useful in the selection and training of leaders for various 
types of situations. 


While a strong effort has been made to state objectives in terms which 
would not delimit or over-structure the conception of the leadership 
problem, the research procedures have not remained unstructured. In 
fact, the stated objectives can only be accomplished through the careful 
integration of a variety of research approaches including job analysis,
organizational analysis, the interview, questionnaires, attitude scales, 
sociometrics and other methods and techniques. 

The methods being developed are designed to study formal organization
as a complex of relationships and processes. An attempt is being
made to analyze as completely as possible the interrelationships which
determine leadership status. Not all the factors which define leadership
are easy to describe in quantitative terms. However, a strong effort
is being made to reduce all data to terms which will permit quantitative
treatment.


Progress has been made in the quantification of the following variables: 


1. Level in the organization (usually defined by location of an executive's
position on an organization chart).
2. Responsibility patterns (may be defined by [illegible] organization
manuals, or by common understanding in [illegible] above).
3. Sociometric scores (defined in terms of the number of times an executive
is [illegible] as being one with whom most time is spent in getting work
done).
4. Work patterns (defined in terms of what the executive actually does and
the methods he employs in carrying out his duties).
5. [illegible] contacts (defined in terms of executive's own estimate of
per cent of time spent with persons, as opposed to individual effort).
6. [illegible] (defined in terms of an executive's estimate of his own
[illegible] authority status and of the delegation of authority to
[illegible]).
7. Methods of working with staff (defined in terms of a rating scale applied
to the executive's statement of his methods for getting the best work out
of [illegible]).




Thus far in the Navy project, three staff organizations have been
studied. In the first two, a stratified sample was studied. In the third, 
all commissioned officers were studied. 

In order to illustrate the application of methods, the results obtained 
from the study of a single staff are presented. This is a Naval Command
staff, the primary mission of which is the coordination of a wide variety 
of administrative activities within the shore establishment. Twenty- 
four top line and staff positions were studied. Six levels in the organi- 
zation structure were represented. No civilian executives were included. 


Working Relationships 

Each officer interviewed was asked to name in rank order the persons
with whom he spent the most time in the process of getting work done.
The resulting lists provide a basis for making a sociometric study of
working relationships among the various members of the staff. Working
relationships among the staff members, as revealed by sociometric
diagrams, depart rather markedly in some departments from the formal
organization chart. It is apparent that organization manuals and
organization charts define responsibilities and lines of authority, rather
than the informal organization of day-to-day working relationships.
Sociometric ratings reveal some tendency for a concentration of contacts in
those officers who are most actively engaged in carrying out the major
policies of the organization at the time of the study, regardless of their
military rank or level in the organization structure. However, sociometric
ratings correlated +.57 with level in the organization scale. The
correlations of sociometric scores with other factors are shown in the
course of discussion.
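One plausible way to turn the officers' lists into sociometric scores is to count how many colleagues name each person. The sketch below does only that. It is an assumption-laden illustration, since the article's exact scoring rule is not fully legible: rank order within each list is ignored, self-nominations are discarded, and the staff members "A" to "D" are hypothetical.

```python
from collections import Counter


def sociometric_scores(nominations: dict) -> Counter:
    """Count how many other staff members name each person.

    `nominations` maps each respondent to the list of persons he named.
    Rank order in the list is ignored and self-nominations are dropped.
    """
    counts = Counter()
    for respondent, named in nominations.items():
        for person in set(named):  # count each nominee once per respondent
            if person != respondent:
                counts[person] += 1
    return counts


# Hypothetical staff of four (not data from the study).
nominations = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["B"],
    "D": ["B", "A"],
}
scores = sociometric_scores(nominations)
print(scores["B"], scores["A"], scores["C"], scores["D"])  # 3 2 2 0
```

Scores of this kind could then be correlated with level in the organization or with other variables, as the authors report.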


RAD Index


It has been postulated, for purposes of this study, that leadership is
a function of the interrelated patterns of responsibilities of the members
of the organization. It is assumed that effective leadership exists when
the members at all levels in the organization are making their maximal
contribution in carrying out responsibilities essential to the success of
the enterprise. The effective leader would be expected to influence the
work patterns of his immediate subordinates to a greater extent than any
other person in the organization, and this influence would presumably
tend to enlarge rather than restrict the contribution and participation
of subordinates.

In order to test one phase of this hypothesis, three scales were devised
for the purpose of measuring the estimate of an individual regarding the
following factors: a. His level of responsibility; b. His level of authority;
and c. The degree of authority he delegates to his subordinates.




The scores derived from each of these three sets of scales have been 
combined in various ways to determine interrelationships. One of the 
possible combinations of these scores appears to be the following: 


$$\frac{\text{Authority Score}}{\text{Responsibility Score}} \times \text{Delegation Score}$$


The term RAD Index, as used in this paper, refers to the above
combination of scores.

A correlation of -.36 between RAD Index and sociometric ratings
indicates that for this particular staff there is a slight relationship
between an individual's estimate as to his
responsibility-authority-delegation status and the extent to which he is
contacted by other staff members in getting work done. Due to the method of
scaling the items, a low RAD Index score was associated with high
leadership status. There was a correlation of -.57 between RAD Index and
level in the organization structure. Per cent of time spent in contacts
with persons was correlated -.40 with RAD Index.

RAD Index scores appear to give some indication of the relation of an 
individual's estimate regarding his status in the organization to his wil- 
lingness to provide his assistants with adequate scope for carrying out 
their responsibilities. It appears that the capacity of an individual to 
provide his subordinates with adequate scope for action may be con- 
ditioned to a considerable degree by the freedom or constraint he feels in 
discharging his own responsibilities. 


Work Patterns 


In accord with the hypothesis that leadership is a function of the in- 
terrelated patterns of responsibilities of the members of an organization, 
one would expect to find that work patterns are related to sociometric
ratings and other factors. This was found to be the case. 

When sociometric ratings were plotted against percentage of time
spent in major administrative functions, the highest correlations were
found to be with planning and coordination (+.49 and +.46 respectively).
Such functions as research, inspection and public relations showed a low
negative correlation with sociometric ratings.

Planning, coordination, and the preparation of procedures for the 
carrying out of plans were also positively correlated with per cent of time 
spent in contacts with persons. The correlation was +.39 for planning, 
and +.30 for coordination, and +.26 for the preparation of procedures. 
As would be expected, research was negatively correlated with per cent 
of time spent with persons. The correlation coefficient was −.43.


Patterns of Leadership Behavior 291 


RAD Index was found to be correlated −.60 with inspection, −.43
[…]g, and −.34 with coordination.

A method has been developed for determining the degree of similarity
[…] the work patterns of the members of an organization, and of
[…] on a chart, resembling a sociometric diagram, those clusters of
[…] whose work patterns are most individualistic. Some persons
[…] clusters in the sociometric diagrams are also clustered in the
work patterns diagrams.

The analyses that have been made thus far suggest that the work
patterns of executives differ not only with such factors as level in the
organization, but with departmental function and mission, as well as
changes in the objectives and activities of the organization. Further
[…] be required in order to determine what the specific patterns are
for various types of executive positions in the lower levels. An attempt
is being made to determine whether any uniformities exist in similar
positions in different organizations. An effort is being made to
determine, as a specific example, whether a position demanding a high
[…] of planning and coordination can be filled adequately by an
[…] whose usual pattern of work is heavily loaded with public
relations or perhaps research or supervision, or whether the position can
[…] handled by a person who has already acquired a planning-coordination
pattern of work.

Four possible criteria of leadership status have been discovered which
[…] that planning and coordination are primary functions of top
[…] in the staff under discussion. These four lines of evidence are
sociometric ratings, level in the organization structure, per cent of time
spent in contacts with persons, and RAD Index. As would be expected,
the extent to which these possible criteria are correlated with each other
and with other factors varies considerably from one staff to another.

Results at the preliminary stages of the study suggest that a number
of the methods employed hold promise for further development and
improvement.


Adult Leadership Scales Based on the Bernreuter 
Personality Inventory * 


Helen M. Richardson 
New Jersey College for Women, Rutgers University 


In an earlier paper Hanawalt and Richardson (3) reported an analysis 
of items in the Bernreuter Personality Inventory to which the responses 
of adult men who were leaders in vocational and social activities were 
significantly different from responses of non-leaders. Responses of 
“Office-Holders” were compared with responses of “Non-Office-Holders,”
and responses of “Supervisors” with those of “Non-Supervisors.” Analy-
sis of the responses suggested the possibility of deriving leadership scales
by assigning scoring weights to the significant Bernreuter items on the 
basis of the degree to which they elicited different responses from the 
contrasting groups of leaders and non-leaders. The construction of such 
scales is described in the present paper, and data on validity and relia- 
bility are set forth and discussed. 


Subjects 


Two main samples of subjects were used in the study: (a) the indi-
viduals whose responses were employed in determination of the scoring
weights to be assigned to the test items, designated as the Item-Weight-
ing subjects; and (b) the individuals to whom the derived scales were
applied for testing validity, designated as the Validation subjects. None
of the Item-Weighting subjects were used in the Validation groups. In
addition to these two main samples, some subjects independent of both
groups were included in the computations of reliability coefficients and
of intercorrelations.

The subjects in all cases were men aged 26 years or over, and were 
obtained chiefly by asking psychology students at New Jersey College 
for Women to request their fathers or other older men of their acquaint- 
ance to fill out the Bernreuter questionnaire anonymously and to furnish 

* Construction and testing of the 23-item scales […] were performed
by Miss Ethel M. Estoppey and Miss […], […] performed these operations
for the two lengthened scales, and determined the occupational and age
distributions of the subjects in the validation groups. Assistance in
scoring and computation was rendered by Miss […] and Miss Marcia
Swetland, […].




additional information as to age, occupation, number of persons under 
their supervision, and offices held since the age of 21 in any organizations 
(professional, business, civic, religious, fraternal, or social). On the basis
of the supplementary data, respondents were classified into two pairs of 
contrasting groups: Office-Holders vs. Non-Office-Holders, and Super- 
visors vs. Non-Supervisors. Office-Holders were defined as persons who 
reported having held at least two presidencies or apparently important 
chairmanships in organizations; Non-Office-Holders were those who re- 
ported no offices. Supervisors were defined as individuals who stated 
that they had fifteen or more persons under their direction or super-
vision, supposedly in an executive capacity. They were contrasted with
Non-Supervisors, who reported not more than one person under their 
supervision. Respondents who could not be classified in any of these 
four contrasting groups were designated as Non-Contrasting subjects. 
The number of subjects in each group was as follows: 


                   Office-    Non-Office-    Super-    Non-Super-
                   Holders    Holders        visors    visors

Item-Weighting       57          70            90         88
Validation           48          56            44         45

Table 1


Occupational Distribution (Percentages) of the Samplings Used in Item-Weighting
and Validation Compared with 1940 Distribution of Employed
Adult Males in New Jersey


                                                       Item-Weighting   Validation
Occupational Group                      New Jersey     Groups           Groups

Professional                               5.9            30.2            27.6
Semi-professional                          1.6             3.1             4.5
Farmers and Farm Managers                  1.8             4.7             2.3
Proprietors, Managers, and Officials,
  except Farm                             11.5            26.0            25.4
Clerical, Sales                           17.8            23.2            19.4
Craftsmen, Foremen                        19.4             7.0            11.2
Operatives                                22.2             3.1             2.2
Domestic Service                           0.4             —               —
Service, except Domestic                   7.5             2.7             6.7
Farm Laborers (paid) and Farm
  Foremen                              [illegible]         —               —
Farm Laborers (unpaid family workers)  [illegible]         —               —
Laborers, […]                              9.6             —               0.7
Occupation not Reported                    1.0             —               —




In the Item-Weighting groups, 23 of the 57 Office-Holders were also
Supervisors, 7 were Non-Supervisors and the remaining 27 were non-
contrasting with respect to supervisorship; 14 of the 70 Non-Office-
Holders were Supervisors, 22 were Non-Supervisors, and 34 were non-
contrasting. In the Validation groups, 20 of the 48 Office-Holders were
Supervisors, 4 were Non-Supervisors, and 24 were non-contrasting with
regard to supervisorship; 17 of the 56 Non-Office-Holders were Super-
visors, 26 were Non-Supervisors, and 13 were non-contrasting.

All of the Item-Weighting subjects filled out their questionnaires dur-
ing the years 1939-1942, when they furnished the basis for the earlier
item-analysis by Hanawalt and Richardson (3). Data from the Vali-
dation groups were obtained in 1943-1944, except that ten members of
the Validation group of Non-Office-Holders were drawn from a surplus
of Non-Office-Holders obtained in the earlier years but not included in
the Item-Weighting groups.

The occupational distribution of the subjects in the contrasting groups 
is given in Table 1, together with the distribution of employed adult 
males in New Jersey according to the 1940 census (10). The column 
headed “Item-Weighting Groups” is based on a total of 258 respondents¹
in 1939-1942 who were classified in the contrasting groups, and includes
the surplus Non-Office-Holders referred to above. The column headed
“Validation Groups” is based on 134 cases¹ classified in the contrasting
groups, and includes 10 Non-Office-Holders from the unused 1939-1942 
subjects along with 124 obtained in 1943-1944. This table is not meant 
to imply that our groups should have shown the same distribution as the 
census figures, but is presented merely to indicate the composition of our 
groups and to permit a comparison of the occupational distribution of 
the Item-Weighting and Validation subjects. It is evident that the two 
sets of subjects are very similar in occupational distribution. 

Table 2, presenting the age distribution of the various classes of sub-
jects in the Item-Weighting and Validation groups, shows the change
that might be expected with a shift to war years from years that were
largely pre-war. In the Validation groups (1943-1944) the percentages
in the age group from 26-35 are considerably reduced, especially among
the Non-Office-Holders and Non-Supervisors, which in the earlier period
had a large proportion of younger men. In both the Item-Weighting and
the Validation groups, the percentages of Office-Holders and Supervisors
¹ The totals cited here for the Item-Weighting and Validation groups represent the
number of different individuals serving as subjects. […] with the dual
classification of Office-Holder and Supervisor, for example, was counted only once,
though he figures in both these columns in the summary. The fact that the two sets
of figures cannot be made to check by allowing for the duplications is due to the
presence of the “surplus Non-Office-Holders” mentioned above.




are greatest in the ages above 45. This may indicate that, to some degree, 
leadership which is merely potential in the earlier years is realized later 
in life. Existence of such latent leadership in the younger groups of 
non-leaders might be expected to have an adverse effect on the validation 
of leadership scales. 


Table 2

Age Distribution (Percentages) of Item-Weighting Subjects
and of Validation Subjects

                       Office-    Non-Office-    Super-    Non-
Age                    Holders    Holders        visors    Supervisors    All

Item-Weighting Groups
26-35                   12.2        41.4          10.0        60.2        32.9
36-45                   19.2        17.2          26.6        10.2        18.9
46-                     68.4        41.4          63.3        29.5        48.1

Validation Groups
26-35                    4.2        26.8          11.4        24.4        18.7
36-45                   25.0        33.9          29.6        31.1        29.9
46-                     70.8        39.3          59.1        44.4        51.4


Preliminary Study: 23-Item Scales 


The item-analysis by Hanawalt and Richardson (3) which served as 
the point of departure for the present study revealed 23 items for which 
the chi-square test indicated a significant difference (P value of .05 or less) 
in the distribution of “Yes,” “No,” and “?” responses by Office-Holders
and Non-Office-Holders, and 23 items for which there was a significant
difference in the responses of Supervisors and Non-Supervisors. Nine
items were common to the two lists. For a discussion of these items the
reader is referred to the earlier paper. In constructing scales for dis-
tinguishing leaders from non-leaders in the two fields, a preliminary
trial was essayed in which each scale was made up of the 23 items indi-
cated above.
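
The item-selection criterion described above, a chi-square test on the distribution of “Yes,” “No,” and “?” responses in two contrasting groups, may be illustrated by a minimal sketch. The response counts below are hypothetical and are not taken from the study; the function name is likewise an assumption made for illustration. The cutoff of 5.991 is the standard tabled chi-square value for P = .05 with 2 degrees of freedom, which is what a 2 x 3 table provides.

```python
def chi_square(table):
    """Pearson chi-square statistic for a rows x columns table of counts.

    Assumes every row and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts of "Yes", "No", "?" responses to a single item
# (illustrative only; group sizes match the 57 and 70 Item-Weighting subjects).
office_holders = [40, 12, 5]
non_office_holders = [30, 34, 6]

CRITICAL_05_DF2 = 5.991  # tabled chi-square for P = .05, 2 degrees of freedom

stat = chi_square([office_holders, non_office_holders])
significant = stat >= CRITICAL_05_DF2
print(f"chi-square = {stat:.2f}, significant at .05: {significant}")
```

An item would be retained for the 23-item scale under this criterion only when the statistic reaches the cutoff.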

Determination of Item Weights for the 23-Item Scales. Scoring
weights were derived by Kelley’s revised formula (4), following the
method for use with semi-equalized four-fold tables described by Strong
(7, pp. 611-615), and employed by him in deriving scoring weights for
his Vocational Interest Tests. Strong and Carter (8) found this method 
slightly superior to Kelley’s original formula, which was used by Strong 
in an earlier scoring of the interest tests, and which was followed by 




Bernreuter (1). Derivation of the weights was facilitated by the use of 
a chart prepared by Strong (6). This chart provided for a range of
weights from +4 to —4. Plus values were assigned to responses which 
were more characteristic of leaders, negative values to responses given 
preponderantly by non-leaders. 

When the responses of the 57 Office-Holders and 70 Non-Office- 
Holders in the Item-Weighting group were scored according to the Office- 
Holder scale and means for the two groups were computed, t, based on
the standard error of the difference between the means, was 10.97.

Validity and Reliability of the 23-Item Scales. The 23-item scales for 
Office-Holders and for Supervisors respectively were then tested by ap- 
plication to entirely new contrasting groups of 30 Office-Holders vs. 30 
Non-Office-Holders and 31 Supervisors vs. 38 Non-Supervisors. Results 


Table 3


Validity and Reliability Data from Try-Out of Preliminary 23-Item Scales
for Office-Holders and Supervisors


                                               Office-Holder       Supervisor
                                               Scale               Scale
                                               (30 OH; 30 NOH)     (31 S; 38 NS)

Mean Difference between Contrasting Groups        21.5                8.3
SE diff.                                           3.82               2.71
t                                                  5.63               3.06
Percentage of Overlapping                          0.0               39.8
Biserial r                                         .75             [illegible]
Reliability Coefficient (split-half after
  application of Spearman-Brown formula)           .72                .52


of this try-out are briefly summarized in Table 3. In view of the small
number of subjects, the t's indicated fair validity, especially for the
Office-Holder scale, but the reliability left something to be desired.

The next procedure was to see whether, without sacrificing validity,
the reliability could be increased by lengthening the test through in-
clusion of additional items of substantial, though lower, discriminating
value.


Construction of a 101-Item Scale for Office-Holders
and an 84-Item Scale for Supervisors


Determination of Item Weights. Responses of the Item-Weighting
groups to the previously unused Bernreuter items were evaluated accord-
ing to Kelley’s revised formula, by referring the comparative percentages
of “Yes,” “No,” and “?” responses of leaders and non-leaders (Office-
Holders vs. Non-Office-Holders, and Supervisors vs. Non-Supervisors)
to Strong’s chart, following the procedure by which the scoring weights
had been derived for the 23-item Office-Holder and Supervisor scales.
All items which received a weight of 1 or more according to the chart
were included in the augmented scales. With this more lenient criterion
for item-inclusion, the Office-Holder scale was expanded to a total of 101
items, and the Supervisor scale to 84 items, 74 items being common to
both lists. When the lengthened Office-Holder test was applied to the
57 Office-Holders and 70 Non-Office-Holders in the Item-Weighting group,
t, based on the standard error of the difference between the means, was
found to be 11.08. Comparison of this figure with that obtained from
the 23-item test (t = 10.97) indicated that validity had not been reduced
by inclusion of the additional items.


Table 4


Comparison of Scores of Office-Holders, Non-Office-Holders, and Non-Contrasting
Subjects on 101-Item Office-Holder Scale


                                                                          Percent.
                       N     Mean     SD      Mean Diff.   SE diff.   t     Overlap   r bis

Office-Holders         48    55.1    21.30
  OH-NOH                                        31.7        4.59     6.91     7.6   [illegible]
Non-Office-Holders     56    23.4    25.14
  OH-NC                                         13.7        4.09     3.35
Non-Contrasting
  Subjects             96    41.4  [illegible]
  NC-NOH                                        18.0        4.32     4.17


Validity. The validity of the 101-item Office-Holder scale and of the
84-item Supervisor scale was tested by application of the lengthened
scales respectively to the Validation groups of 48 Office-Holders vs. 56
Non-Office-Holders and 44 Supervisors vs. 45 Non-Supervisors. Tables
4 and 5 give the results in terms of t, percentages of overlapping (per-
centage of the non-leaders who exceed the median of the leaders), and
biserial correlations. Some data are also included for comparing Non-
Contrasting subjects with the criterion groups on each scale. The Office-
Holder scale appears, as in the preliminary 23-item tests, to be considera-
bly more discriminating than the Supervisor scale. In comparing the t's
for the shorter and longer forms of each test, one must bear in mind that
the 23-item tests were applied to smaller samples of subjects. If the
number of Validation subjects in the preliminary try-out is adjusted to
equal the number in the later tests of validity, the t's for the shorter
tests are increased to 7.36 for the Office-Holder scale and 3.47 for the



Supervisor scale. It still appears, however, that lengthening the tests 
has not significantly lowered the validity. 
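
The t values reported in Tables 4 and 5 are ratios of a mean difference to the standard error of that difference. The following is a minimal sketch of such a computation, using the means, standard deviations, and group sizes of the Validation groups as read from those tables. The paper does not state which standard-error formula was used; the form below, with N − 1 in each denominator, is an assumption, chosen because it comes close to the printed figures. The function names are illustrative.

```python
import math


def se_of_difference(sd1, n1, sd2, n2):
    """Standard error of the difference between two independent means.

    Assumed form: sqrt(SD1^2 / (N1 - 1) + SD2^2 / (N2 - 1)).
    """
    return math.sqrt(sd1 ** 2 / (n1 - 1) + sd2 ** 2 / (n2 - 1))


def critical_ratio(mean1, sd1, n1, mean2, sd2, n2):
    """Mean difference divided by its standard error (the t of the text)."""
    return (mean1 - mean2) / se_of_difference(sd1, n1, sd2, n2)


# Office-Holders vs. Non-Office-Holders on the 101-item scale (Table 4).
t_office = critical_ratio(55.1, 21.30, 48, 23.4, 25.14, 56)

# Supervisors vs. Non-Supervisors on the 84-item scale (Table 5).
t_super = critical_ratio(32.7, 16.46, 44, 17.4, 19.23, 45)

print(f"Office-Holder scale: t = {t_office:.2f}")
print(f"Supervisor scale:    t = {t_super:.2f}")
```

Under this assumption the results fall within a few hundredths of the tabled 6.91 and 3.98; the small discrepancies presumably reflect rounding in the published means and standard deviations.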

It is not surprising that the mean scores for Non-Contrasting subjects,
while lying as they should between the means for the contrasting groups,
are closer to the means for Office-Holders and Supervisors respectively
than to Non-Office-Holders and Non-Supervisors. Non-Contrasting sub-
jects were respondents who reported offices or persons under their super-
vision, but to a lesser extent than the criterion subjects. The t for the
difference between the means of Non-Contrasting subjects and Non-
Supervisors is significant at the .02 level, and borders on the .01 level.


Table 5


Comparison of Scores of Supervisors, Non-Supervisors, and Non-Contrasting
Subjects on 84-Item Supervisor Scale


                                                                          Percent.
                       N     Mean     SD      Mean Diff.   SE diff.   t     Overlap   r bis

Supervisors            44    32.7    16.46
  S-NS                                          15.3        3.84     3.98    19.7     .49
Non-Supervisors        45    17.4    19.23
  S-NC                                           5.1        3.49     1.46
Non-Contrasting
  Subjects             36    27.6    14.32
  NC-NS                                         10.2        3.78     2.70


Reliability. For computing the reliability of the lengthened scales, 
the four Validation groups were combined in a single list of 134 subjects. 
In order to avoid the possibility of spuriously increasing the reliability 
coefficients by influence of the contrasting groups, 96 Non-Contrasting 
subjects were added to the list for computing reliability of the Office- 
Holder scale. Since the reliability coefficient for this list of 230 cases 
actually proved to be greater (by .01) than the coefficient for the 134 
subjects, it was considered unnecessary to score all the Non-Contrasting 
questionnaires on the Supervisor scale. The coefficients of reliability 
were computed by using the split-half method (Odd-Even) and applying 
the Spearman-Brown prophecy formula. For the Office-Holder scale
(N = 230) the reliability coefficient is .81; for the Supervisor scale
(N = 170) the coefficient is .69.
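
The split-half (Odd-Even) procedure named above may be sketched as follows. Each subject's weighted item scores are summed separately over the odd-numbered and the even-numbered items, the two half-scores are correlated, and the correlation is stepped up to full length. The item scores shown are hypothetical, and the function names are assumptions introduced only for illustration.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def split_half_reliability(item_scores):
    """Odd-even split-half reliability, stepped up by Spearman-Brown.

    item_scores holds one list of weighted item scores per subject.
    """
    odd_half = [sum(person[0::2]) for person in item_scores]   # items 1, 3, 5, ...
    even_half = [sum(person[1::2]) for person in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_half, even_half)
    return 2 * r_half / (1 + r_half)


# Hypothetical weighted item scores for five subjects on a six-item scale.
scores = [
    [2, 1, 3, 2, -1, 0],
    [-2, -1, 0, -3, 1, -1],
    [4, 3, 2, 2, 1, 3],
    [0, -1, 1, 1, -2, 0],
    [-3, -2, -1, -4, 0, -2],
]
reliability = split_half_reliability(scores)
print(f"split-half reliability = {reliability:.2f}")
```

The final step is the Spearman-Brown formula for a test of double length, since each half contains only half of the items.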

Intercorrelations. A few intercorrelations are reported in Table 6.
Correlations with Bernreuter scores have been computed only for the
Office-Holder scale, since the reliability and validity of the Supervisor
scale did not seem to justify the expenditure of time in scoring, tabulation,
and computation. For correlation with the Office-Holder scores, F1-C
was selected because this scale gave the most significant difference (crit-
ical ratio = 3.97) between Office-Holders and Non-Office-Holders in the
previous study by Richardson and Hanawalt (5). B1-N and B3-I
yielded critical ratios almost as great as F1-C, but have not been in-
cluded in the present intercorrelations because of their high correlation
with F1-C. B4-D was selected for correlation computation because
along with its value in discriminating between Office-Holders and Non-
Office-Holders (critical ratio = 3.69) it carries a lower correlation than
B1-N and B3-I with F1-C.


Table 6

Correlation of Office-Holder Scores with Other Scales

                                               N        r

Office-Holder Scale and Supervisor Scale      170      .82
Office-Holder and F1-C                         69     −.83
Office-Holder and B4-D                         69      .75


Discussion 


When the correlation between the Office-Holder scale and F1-C was 
found to be slightly greater than the reliability coefficient of the former 
scale, the first thought that came to mind was that of a certain king who 
"with ten thousand men, marched up the hill and down again." Later 
reflection, however, suggested that mountains are frequently ascended 
for no other purpose than exploration, surveying, and mapping. Even 
if the conclusion is that F1-C (or B1-N) would serve the purpose of dis-
criminating Office-Holders about as well as our Office-Holder scale, some
exploratory value may be found in the study which we have carried 
through. 

In the first place, the possibility of constructing such leadership scales 
from the Bernreuter items was definitely an open question, suggested 
by the previous item-analysis by Hanawalt and Richardson (3). As a 
matter of fact, the Office-Holder scale and the much less discriminating 
Supervisor scale actually have fulfilled our expectation that they would 
distinguish Office-Holders from Non-Office-Holders and Supervisors from 
Non-Supervisors better than any of the Bernreuter scales. The Office- 
Holder scale is considerably superior to any Bernreuter scale in this re-
spect. For convenience in comparison, Tables 7 and 8 give the critical
ratios (mean difference/SE diff.) from Richardson and Hanawalt’s earlier
study (5) of the Bernreuter scales, along with the t's from our Office-
Holder and Supervisor scales. Reliability coefficients from the Bern-
reuter Manual (2) and from the present study are also included. In
reliability, the Office-Holder scale is not far behind F1-C, and it is much
more discriminating. At this point attention may be called again to the
fact that the Validation groups of 48 Office-Holders, 56 Non-Office-
Holders, 44 Supervisors, and 45 Non-Supervisors are entirely independent
of the Item-Weighting groups from which the scale weights were derived.
Application of the Office-Holder scale to the Item-Weighting groups of
Office-Holders and Non-Office-Holders yielded a t of 11.08.


Table 7


Office-Holders vs. Non-Office-Holders: Critical Ratios (Mean Difference/SE diff.)
on Bernreuter and Office-Holder Scales. Reliability
Coefficients of these Scales


                   N      N       Critical      N in Reliability    Coefficient of
Scale              OH     NOH     Ratio or t    Study               Reliability

B1-N               57     116      −3.91           128                 .88
B2-S               57     116      −0.18           128                 .85
B3-I               57     116      −3.81           128                 .85
B4-D               57     116       3.69           128                 .88
F1-C               57     116      −3.97           100                 .86
F2-S               57     116      −2.33           100                 .78
Office-Holder      48      56       6.91           230                 .81


Table 8


Supervisors vs. Non-Supervisors: Critical Ratios (Mean Difference/SE diff.)
on Bernreuter and Supervisor Scales. Reliability
Coefficients of these Scales


                   N      N       Critical      N in Reliability    Coefficient of
Scale              S      NS      Ratio or t    Study               Reliability

[illegible]        90     88        1.90
[illegible]        90     88       −2.11
[illegible]        90     88        2.72        See Table 7
F1-C               90     88       −3.48
F2-S               90     88        1.97
Supervisor         44     45        3.98           170                 .69


The present study agrees with the earlier work by Hanawalt and
Richardson (3) in showing throughout that the Office-Holders differ from
the Non-Office-Holders more than the Supervisors differ from the Non-
Supervisors. A larger number of discriminating items were found for the
Office-Holder scale. The indices of validity (t, biserial r, and percentage
of overlapping) show greater differentiation between Office-Holders and
Non-Office-Holders than between Supervisors and Non-Supervisors.
Supervisorship may actually be less tied up than office-holding with the 
personality pattern, and more subject to factors outside the person, or 
at least to factors outside the Bernreuter Inventory items. We cannot 
claim to have developed a highly reliable and valid device for selecting 
supervisors. We have shown, however, what can be done by way of 
developing from the Bernreuter Inventory a scale based on the items 
which best distinguish a group of Supervisors from Non-Supervisors. 
A study based on a group of supervisors of proved excellence might be 
more rewarding. Uhrbrock and Richardson (9) have reported a Test 
for the Selection of Supervisors in which they obtained a correlation of 
.71 ± .03 between score on 85 significant test items and a criterion score
based on ratings of supervisors by superintendents. These investigators 
included parts of the Thurstone Personality Schedule among the 820 
psychological and interest items which they tried out, but they do not 
indicate how useful they found the Thurstone items, which most closely 
resemble the material used in the present study. 

Objections to such questionnaires as the Bernreuter or ours for the 
purpose of candidate selection are too well-known to be repeated here. 
On the other hand, the data at least from our Office-Holder test are 
evidence of a certain validity in the method when responses are made
seriously and in good faith by persons who have nothing to gain from 
making a certain showing. The Office-Holder scale might be useful in 
counseling a person who sought an appraisal of his attributes. 

Resemblances and differences between the characteristics of Super-
visors and of Office-Holders are indicated by correspondence and differ-
ence in the discriminating items for our two scales (74 out of the 84
Supervisor items being significant also for Office-Holders) and have been
discussed at length in the earlier paper by Hanawalt and Richardson (3).
Item-analysis indicates that both types of leaders are characterized by
dominance and good adjustment, but that Supervisors tend to be more
self-sufficient than Office-Holders. A measure of community between
the characteristics of the two groups is given in the correlation of .82 be-
tween our Office-Holder and Supervisor scales.

The possibility of increasing the reliability of a test by adding to its
length has been demonstrated by the difference in the reliability coeffi-
cients for our shorter and longer tests. That the increase fell short of
what would have been predicted from the Spearman-Brown prophecy
formula need not be taken to discredit the formula. The items added to
lengthen the scales were not strictly comparable to the original items,
being of lower discriminating value. If the additional items had met
this criterion, the Office-Holder test, expanded to 4.4 times its original
length, should have increased its reliability from .72 to .92 instead of
merely to .81; and the Supervisor test, augmented to 3.7 times as many
items as the number in the shorter form, should have had its reliability
raised from .52 to .80 instead of to .69.
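
The predicted figures in the preceding paragraph follow from the general Spearman-Brown prophecy formula, r_n = n r / (1 + (n − 1) r), where r is the reliability of the original test and n is the factor by which it is lengthened. A minimal sketch of the arithmetic is given below; the function name is an assumption, while the reliabilities and lengthening factors are those stated in the text.

```python
def spearman_brown(r, n):
    """Predicted reliability of a test lengthened n times, given reliability r."""
    return n * r / (1 + (n - 1) * r)


# Office-Holder scale: 23 items lengthened to 101 (about 4.4 times), r = .72.
predicted_office = spearman_brown(0.72, 4.4)

# Supervisor scale: 23 items lengthened to 84 (about 3.7 times), r = .52.
predicted_super = spearman_brown(0.52, 3.7)

print(f"Office-Holder scale: predicted {predicted_office:.2f}, obtained .81")
print(f"Supervisor scale:    predicted {predicted_super:.2f}, obtained .69")
```

The formula presupposes that the added items are comparable to the original ones, which is the condition the lengthened scales did not meet.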


Summary 


1. […] activities were significantly different from responses of non-leaders.
Item weights for an Office-Holder scale were obtained from responses
of 57 Office-Holders and 70 Non-Office-Holders; weights for a Supervisor
scale were derived from responses of 90 Supervisors and 88 Non-Super-
visors.

2. Brief scales based respectively on the 23 items found by chi-
square test to yield P values of .05 or less in an earlier study of the same
groups (3) were applied to groups of new subjects. The 23-item Office-
Holder scale yielded a t of 5.63 for the difference between the means of
30 Office-Holders and 30 Non-Office-Holders, with a reliability coefficient
of .72. For the 23-item Supervisor scale, applied to 31 Supervisors and
38 Non-Supervisors, t was 3.06 and the reliability coefficient was .52.
Biserial r’s and percentages of overlapping are also reported for each scale.

3. By including all items which received a weight of 1 or more accord-
ing to Strong's chart, the Office-Holder scale was lengthened to 101 items
and the Supervisor scale to 84 items. Validity and reliability were tested
by application of the scales to groups of subjects entirely separate from
those used in the item-weighting, but including the cases used in testing
the 23-item scales. For the Office-Holder scale, t (based on means of 48
Office-Holders and 56 Non-Office-Holders) was 6.91; biserial r was […];
percentage of overlapping 7.6; reliability coefficient .81. For the Super-
visor scale, t (44 Supervisors, 45 Non-Supervisors) was 3.98; biserial r .49;
percentage of overlapping 19.7; reliability coefficient .69. Lengthening the
scales was thus found to have increased their reliability without lessening
their validity.

4. […] useful for guidance though not for candidate selection. It dis-
criminates between Office-Holders and Non-Office-Holders considerably
better than any of the Bernreuter scales. The correlation between the
Office-Holder scale and F1-C, however, is slightly greater than the re-
liability coefficient of the former.


Received October 2, 1947. 
References 


1. Bernreuter, R. G. The theory and construction of the Personality Inventory.
J. soc. Psychol., 1933, 4, 387-405.

2. Bernreuter, R. G. Manual for the personality inventory. Stanford Univ.: Stanford
Univ. Press, 1935. Pp. 6.

3. Hanawalt, N. G., and Richardson, H. M. Leadership as related to the Bernreuter
personality measures: IV. An item analysis of responses of adult leaders and
non-leaders. J. appl. Psychol., 1944, 28, 397-411.

4. Kelley, T. L. The scoring of alternative responses with reference to some criterion.
J. educ. Psychol., 1934, 25, 504-510.

5. Richardson, H. M., and Hanawalt, N. G. Leadership as related to the Bernreuter
personality measures: III. Leadership among adult men in vocational and social
activities. J. appl. Psychol., 1944, 28, 308-317.

6. Strong, E. K., Jr. Chart for the computation of weights for interest test items.
Stanford Univ., 1935. (Photostatic copy available from author.)

7. Strong, E. K., Jr. Vocational interests of men and women. Stanford Univ.: Stan-
ford Univ. Press, 1943. Pp. xxix + 746.

8. Strong, E. K., Jr., and Carter, H. D. Efficiency plus economy in scoring an interest
test. J. educ. Psychol., 1935, 26, 579-586.

9. Uhrbrock, R. S., and Richardson, M. W. Item analysis: the basis for constructing
a test for forecasting supervisory ability. Person. J., 1933, 12, 141-154.

10. U. S. Bureau of the Census: Sixteenth Census of the United States, 1940. Population:
Characteristics of the population of New Jersey, 2nd Series, Washington: U. S.
Gov. Printing Office, 1942.


Identification of Cola Beverages, I. First Study. 


N. H. Pronko and J. W. Bowles, Jr. 
University of Wichita 


When subjects are asked to taste and identify four samples of cola 
beverages, only three of which are generally known, what pattern of iden- 
tifications will be observed in the situation? Will the Ss apply four 
different naming categories or will they tend to repeat one of the pre-
viously used brand names? What relationships will appear between (a)
the Ss’ judgments when given four different cola drinks to identify as
compared with (b) Ss' judgments when the same cola drink is presented 
four times? If (a) and (b) should be significantly different, then appar- 
ently S is making a taste discrimination on the basis of the actual chemical 
and physical properties of the stimulus objects. On the other hand, if 
(a) and (b) should be essentially similar in their patterning, then Ss’ 
judgments must be explained otherwise. The present experiment was 
designed to yield an answer to these related problems. 


Procedure 


The subjects of the present investigation consisted of 168 college 
students, for the most part members of the General Psychology courses. 
There were 117 males and 51 females. 

Part I. Each of 108 Ss was admitted singly into the experimental room 
and was invited to sit down at a small table containing a tray with four 
1 oz. samples of cola beverages. The following instructions were then 
read to him. 


I would like to have you taste and identify some cola drinks. You will 
be told in what order and when you are to drink them. After you have finished 
each drink, give your identification to E and, by referring to the scale placed 
before you, indicate the particular degree of certainty that applies to each 
judgment. (This printed scale placed in front of the subject employs the 
following steps: (1) very certain; (2) moderately certain; (3) moderately 
uncertain; (4) very uncertain.) After each stimulus presentation, take enough 
water from the paper cup before you to rinse your mouth well. 


From the tray placed before him, S picked up and drank in a certain 
order the contents of four one oz. glasses labelled w, x, y, and z, symbolizing, 
respectively, Coca Cola, Pepsi Cola, RC (Royal Crown) Cola, and Vess 
Cola. After each drink the S's identification and degree of certainty of 
judgment were immediately recorded. His name and other pertinent 
information were also obtained and recorded between drinks, which were 
spaced approximately a minute apart. At the conclusion of the experi- 
ment, S was asked to say nothing about the experiment except that it 
was a test of taste discrimination, although it should be emphasized that 
at no time could S see any part of the preparation of the beverages or get 
any cues that might indicate the nature of the experiment. At all times, 
all bottles, etc. were kept out of sight behind screens. The beverages 
were kept iced until used so that whatever temperature variation may 
have obtained was slight and was constant for all four beverages. The 
order of presentation of the four beverages, pre-determined, was such 
that each of the four stimuli appeared four times in each of the first, 
second, third and fourth positions, forward and backward. This 
counter-balanced order was used to preclude the operation of effects re- 
sulting from position of stimuli or from stimulus interactions in the mouth. 
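
The authors do not print the presentation sequences they used. As a minimal sketch of one way such a counter-balanced set could be built, the following assumes a Williams-type Latin square (each brand once in each position, each brand followed once by every other brand) plus its reversals; the function names and the choice of design are illustrative assumptions, not the authors' procedure.

```python
from collections import Counter

BRANDS = ["Coca Cola", "Pepsi Cola", "RC Cola", "Vess Cola"]


def williams_square(n):
    """Latin square for even n, balanced for immediate carry-over effects."""
    first, lo, hi, take_lo = [0], 1, n - 1, True
    while lo <= hi:
        if take_lo:
            first.append(lo)
            lo += 1
        else:
            first.append(hi)
            hi -= 1
        take_lo = not take_lo
    # Each later row is the first row shifted by a constant (mod n).
    return [[(x + shift) % n for x in first] for shift in range(n)]


def counterbalanced_orders(n):
    """Forward orders followed by the same orders run backward."""
    forward = williams_square(n)
    backward = [row[::-1] for row in forward]
    return forward + backward


orders = counterbalanced_orders(len(BRANDS))

# How often each brand occupies each serial position across the set of orders.
position_counts = Counter(
    (position, brand) for order in orders for position, brand in enumerate(order)
)
# How often each brand is immediately followed by each other brand.
successor_counts = Counter(
    (order[i], order[i + 1]) for order in orders for i in range(len(order) - 1)
)

for order in orders:
    print(" -> ".join(BRANDS[i] for i in order))
```

Cycling Ss through a block of orders of this kind keeps serial position and immediate-predecessor effects evenly spread over the four beverages.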

Part II. In Part II, 60 Ss were administered the same cola drink at 
each of the four trials, the group being evenly divided with respect to 
the four cola brands. Thus, 15 Ss were given all Coca Cola; 15, all Pepsi 
Cola; 15, all RC Cola; and 15, all Vess Cola. In all other respects, the 
procedure was the same as in Part I. 


Results and Discussion 


Inspection of Table 1 will show that, primarily, our group of 108 Ss 
used three categories of identification or naming response with a slight 
sprinkling of less well-known or less advertised product names. When 
all four brands are considered, the total Coca Cola, Pepsi Cola, RC Cola 
and Vess Cola judgments employed are respectively 132, 147, 112, and 4. 


Table 1 


Showing the Distribution of 432 Identification Responses When Each of the 108 Ss 
was Presented in Turn, but in Counter-balanced Order, with a 1 oz. 
Sample of Coca Cola, Pepsi Cola, RC Cola and Vess Cola 
Note: In this part of the experiment, Part I, each S was given four different brands 
of cola drink. (Entries shown as ... are illegible in the source copy.) 


                    Frequency of Ss' Various Identification Responses 
Brand of 
Beverage                                               Dr. 
Given S       C.C.   Pep.   R.C.   Vess   Cleo   Rock  Pep.  Other  D.K.  Total 


Coca Cola      46     34     21    ...    ...    ...   ...    ...   ...    108 
Pepsi Cola     25     50     23    ...    ...    ...   ...    ...   ...    108 
RC Cola        32     29     39    ...    ...    ...   ...    ...   ...    108 
Vess Cola      29     34     29      1    ...    ...   ...    ...   ...    108 
Total         132    147    112      4     10      2     6      7    12    432 




Table 1a 

Identification of Cola Beverages by 108 Ss When Each S was Presented a 
Sample of Each of Four Brands 

                          Brand of Cola Presented 
              Coca Cola   Pepsi Cola   RC Cola     Vess Cola    Totals 
Identifi- 
cation        No.   %     No.   %      No.   %     No.   %      No.   % 

Correct        46   43     50   46      39   36      1    1     136   31 
Incorrect      62   57     58   54      69   64    107   99     296   69 
Totals        108  100    108  100     108  100    108  100     432  100 


as Coca Cola 29 times, as Pepsi Cola 34 times, and as RC Cola 29 times. 
Why this particular pattern of namings? 

The data for the other brands presented are also significant. Note 
that RC Cola is correctly named 39 times, yet is misidentified as Coca 
Cola 32 times and as Pepsi Cola 29 times. Pepsi Cola is “correctly identified” 
50 times, yet is confused 48 times 
with Coca Cola and RC Cola combined. Although Coca Cola is “correctly 
identified” as Coca Cola 46 times, it is misidentified as Pepsi Cola 
or RC Cola a total of 55 times. 

From a slightly different approach Table 1a compares percentages of 


Table 2 
Showing the Distribution of 240 Identification Responses When Each of 60 Ss was 
Presented with Four One-oz. Glasses of Either Coca Cola, 
Pepsi Cola, Royal Crown or Vess Cola 
Note: In this part of the experiment, Part II, each subject received four samples 
of the same brand. (Entries shown as ... are illegible in the source copy.) 


Brand of             Frequency of Identification Responses 
Beverage                                            Dr. 
Presented     C.C.  Pepsi  R.C.  Vess  Cleo  Rock  Pep.  Other  D.K. 


Coca Cola      25    21      6   ...   ...   ...   ...    ...   ... 
Pepsi Cola     27    21     10   ...   ...   ...   ...    ...   ... 
RC Cola        19    20     14   ...   ...   ...   ...    ...   ... 
Vess Cola      18    18     16     3   ...   ...   ...    ...   ... 


Totals         89    80     46     8   ...   ...   ...    ...   ... 




correct with incorrect identifications for each of the four beverages. In 
terms of total identifications, it will be noted that only 31% are correct 
while 69% are misidentified. 
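
The percentages in Table 1a follow directly from the counts of correct identifications. A short sketch of that arithmetic is given here; the variable and function names are illustrative only, and the counts are those reported for Part I.

```python
# Correct identifications in Part I; each brand was presented to all 108 Ss.
correct = {"Coca Cola": 46, "Pepsi Cola": 50, "RC Cola": 39, "Vess Cola": 1}
PRESENTATIONS_PER_BRAND = 108


def percent(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


per_brand_pct = {
    brand: percent(n, PRESENTATIONS_PER_BRAND) for brand, n in correct.items()
}
total_correct = sum(correct.values())
total_presentations = PRESENTATIONS_PER_BRAND * len(correct)
total_correct_pct = percent(total_correct, total_presentations)
total_incorrect_pct = percent(total_presentations - total_correct, total_presentations)

print(per_brand_pct)
print(total_correct, total_presentations, total_correct_pct, total_incorrect_pct)
```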

Results for our second group of 60 Ss in Part II, where each group of 
15 Ss was given four samples of the same cola beverage, are not essentially 
different from the picture obtained from the 108 Ss of Part I that got four 
different drinks. The total number of Coca Cola responses is 89; Pepsi, 
80; RC, 46; Vess, 8 and a scattering of others as shown in Table 2. Note 
that although 25 identification responses for Coca Cola are correct, 
nevertheless, this same label is misapplied almost as often (19 times) 
to RC and more often (27 times) to Pepsi Cola. As a matter 
of fact, one S identified Coca Cola as Seven Up. As for Pepsi Cola, 
although it is correctly identified 21 times, it is more often (27 times) 
misidentified as Coca Cola. In line with our hypothesis, RC Cola is 
correctly identified fewer times (14) than the other two drinks, and is more 
often named as Coca Cola (19) or Pepsi Cola (20) than as itself. 


Table 2a 
Identification of Cola Beverages by 60 Ss When Each S was Presented 
Four Samples of Same Brand 

                          Brands of Cola Presented 
              Coca Cola   Pepsi Cola   R.C. Cola   Vess Cola    Totals 
Identifi- 
cation        No.   %     No.   %      No.   %     No.   %      No.   % 
Correct        25   42     21   35      14   23      3    5      63   26 
Incorrect      35   58     39   65      46   77     57   95     177   74 
Totals         60  100     60  100      60  100     60  100     240  100 

Again, in line with our suppositions, although Vess Cola is “correctly 
identified” three times, it is misidentified six times as frequently (18 times) 
as Pepsi Cola, and on 16 occasions as RC. These findings add further 
support to our hypothesis that the identification response is not a function 
of the physico-chemical properties of the stimulus objects but a matter 
of using an available verbal tag or label for it. The best evidence for 
such an assertion was obtained during the course of the experiment from 
the Ss’ spontaneous remarks. After having judged the second or third 
sample, S would frequently make a remark similar to the following culled 
from our protocols: “Let’s see, what is the other Cola?”; “I’m only 
acquainted with three colas”; and “I can’t even think of any others.” 

In terms of percentages of correct responses, Table 2a will show a 
close approximation to those of Table 1a of Part I. Only 26% of 
identifications are correct here where each S gets four samples of the same 
beverage as compared with 31% correct where four different beverages 
are sampled. 

Thus, when four cola drinks are presented, one of which is a "dark 
horse," the pattern of naming appears to be in terms of the names of the 
better known brands rather than the actual beverages used. It sug- 
gests that perhaps the Ss’ identification responses are a function of famili- 
arity of the verbal label as met with in their reactional biographies, 
through contact with the actual beverages, advertising posters, etc. 

If this latter hypothesis is correct, then the frequency of naming 
responses for each of the four stimuli employed should show a patterning 
related to a familiarity with the names of the products and not to an 
actual taste discrimination and identification of the four stimuli. For 
example, the distribution of the 132 Coca Cola identification responses 
when four different stimuli were actually used should be distributed by 
chance. And so for the others. Actually, significance ratios applied to 
the difference between the frequency of our Ss’ identification responses 
and the expected frequency (Table 3) substantiate our premise, since the 
ratios here are consistently low, in one case only reaching as high as 2.13. 
Apparently, then, our Ss did not actually discriminate the four stimuli 
but merely applied a name from an available repertoire. When a third 
or fourth name was not readily recalled for identifying the third or 
fourth stimulus, Ss tended to repeat one of those previously used. 
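
The comparison of observed with expected frequencies can be illustrated roughly as follows. The sketch assumes that "distributed by chance" means each name is spread equally over the four beverages actually presented; that reading is an assumption, and the authors' own expected frequencies and significance-ratio formula (Table 3) may differ. The observed counts are the Part I frequencies reported in the text.

```python
# Part I: times each familiar name was given, by the beverage actually presented
# (order: Coca Cola, Pepsi Cola, RC Cola, Vess Cola presented).
observed = {
    "Coca Cola": [46, 25, 32, 29],
    "Pepsi Cola": [34, 50, 29, 34],
    "RC Cola": [21, 23, 39, 29],
}


def chance_expectation(counts):
    """Expected count per beverage if a name were applied without discrimination.

    Assumption: an equal split of the name's total over the beverages presented.
    """
    return sum(counts) / len(counts)


expected = {name: chance_expectation(counts) for name, counts in observed.items()}
deviations = {
    name: [count - expected[name] for count in counts]
    for name, counts in observed.items()
}

for name in observed:
    print(name, sum(observed[name]), expected[name], deviations[name])
```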

In our opinion, such evidence indicates that the four beverages were 
not differentiated on gustatory grounds. Instead, S tried to think of a 
number of different names to apply to the four stimuli whether they were 
different as in Part I or the same as in Part II. Note Table 4 which 
presents significance ratios of the preceding analysis. In 14 cases the 
ratios are considerably below 1; in two cases only are they above 1, these 
being 1.50 and 1.06. Additional support is furnished in Table 5 which 
indicates no statistical significance in the differences of the naming 
response patterns when Ss get four different colas or four identical colas. 

Should further argument be needed for the validity of our 
interpretation, the reader is referred to the data presented in Table 6, which 
shows the correct identifications as well as mis-identifications of those of 
our Ss who expressed definite likes or dislikes for certain of the Cola 
beverages. Column 2 indicates the number of Ss that preferred or 
disliked the cola beverages listed in Column 1. Column 3 shows the number 
of the same Ss who correctly identified those beverages. But Column 4 
indicates that from one-third to over two-thirds of those same Ss applied 
the same name to one or more of the cola drinks other than the preferred 
or disliked one. For example, of the 61 Ss who expressed a preference for 

Table 3 

Significance Ratio Tests of the Hypothesis that the Distribution of the Various 
Identification Responses (Part I of Experiment) to the Four Cola Beverages 
Are Not on the Basis of Actual Taste 

[Table body illegible in the source copy.] 


Table 4 

Significance Ratio Tests of the Hypothesis that the Distribution of the Various 
Identification Responses (Part II of Experiment) to the Four Cola Beverages 
Are Not on the Basis of Actual Taste 

[Table body illegible in the source copy.] 

Table 5 
Significance Tests to Determine Whether Differences between Percentages in 
Results of Part I and Part II are Real 

                                  Brands of Cola Presented 
                              Coca     Pepsi    RC       Vess 
Statistic                     Cola     Cola     Cola     Cola     Totals 

P1 (correctly identified 
    in Part I)                43%      46%      36%      1%       31% 
P2 (correctly identified 
    in Part II)               42%      35%      23%      5%       26% 
P1 - P2                       1%       11%      13%      4%       5% 
S.E.                          3.04     28.37    25.09    2.69     6.30 
Significance Ratios           .33      .39      .52      1.49     .79 
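
The significance ratios in Table 5 appear to be each percentage difference divided by its tabulated standard error. The sketch below simply repeats that division using the values printed in the table; how the authors obtained the standard errors themselves is not stated, and the names used here are illustrative.

```python
# (difference in per cent, standard error) as printed in Table 5.
table5 = {
    "Coca Cola": (1, 3.04),
    "Pepsi Cola": (11, 28.37),
    "RC Cola": (13, 25.09),
    "Vess Cola": (4, 2.69),
    "Totals": (5, 6.30),
}


def significance_ratio(difference, standard_error):
    """Ratio of a difference to its standard error, to two decimals."""
    return round(difference / standard_error, 2)


ratios = {brand: significance_ratio(d, se) for brand, (d, se) in table5.items()}
print(ratios)
```

None of the ratios reaches 2, which is in line with the statement that the Part I and Part II percentages do not differ significantly.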


Coca Cola, 28 “correctly identified” it but 19 of the 28 also gave the same 
naming response one or more times to either Pepsi Cola, RC Cola or Vess 
Cola. Of the two Ss who express a preference for RC, neither identifies 
this beverage correctly. Other similar examples may be seen in Table 6. 
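
The "one-third to over two-thirds" range can be checked from the Table 6 rows that are fully legible. In this sketch each entry pairs the number of Ss who correctly identified the preferred or disliked brand with the number of those same Ss who also gave that name to another brand; the dictionary layout is an illustrative choice.

```python
# (Ss correctly identifying the brand, of whom this many also misapplied its name)
table6_rows = {
    "preferred Coca Cola": (28, 19),
    "preferred Pepsi Cola": (19, 7),
    "disliked Pepsi Cola": (5, 2),
    "disliked RC Cola": (5, 3),
}

fractions = {
    label: misapplied / identified
    for label, (identified, misapplied) in table6_rows.items()
}

for label, fraction in fractions.items():
    print(f"{label}: {fraction:.2f}")
print(min(fractions.values()), max(fractions.values()))
```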


Table 6 


Showing Number of Ss (in Part I of the Experiment) Correctly Identifying and 
Mis-identifying Their Preferred or Disliked Brand of Cola 
(Entries shown as ... are illegible in the source copy.) 


                                                  No. of Same “Correctly” 
                                                  Identifying Ss Who 
                No. of Ss       No. of Same Ss    Also Mis-identified 
Preferred       Expressing      “Correctly”       One or More Other 
Beverage        Preference      Identifying       Brands as the Preferred 

Coca Cola           61              28                  19 
Pepsi Cola          31              19                   7 
RC Cola              2               0                  ... 
Vess Cola            0              ...                 ... 


                                                  No. of Same “Correctly” 
                                                  Identifying Ss Who 
                                No. of Same Ss    Also Mis-identified 
Disliked        No. of Ss       “Correctly”       One or More Other 
Beverage        Disliking       Identifying       Brands as the One Disliked 

Coca Cola            6               2                  ... 
Pepsi Cola          10               5                   2 
RC Cola             19               5                   3 




One last point needs to be made. An examination of correct responses 
in Table 1 could possibly be interpreted as showing a trend toward correct 
identification but we believe such a “trend” to be spurious and a mere 
function of applying the most familiar brand names. This is indicated 
in Table 2 both by the infrequency of names of cola drinks not used in the 
experiment and by the infrequency of correct identification of our fourth 
cola brand. Note that Cleo Cola is employed as a label 10 times as 
compared with the name of the actual beverage (Vess Cola) which shows up 
only four times. Again, Vess Cola in both Parts I and II rather than 
being identified as an unknown brand is misidentified as one of the three 
popular drinks. Furthermore, we have already called attention to the 
frequent "searching around” for a third or fourth name on the part of S, 
when he did not readily recall a variety of Cola names. 

In addition, it will be recalled that the data of Table 6 revealed a 
frequent repetition of the names of the correctly identified popular Colas 
in misidentification of the lesser known. Furthermore, we must stress 
the fact that there is no “trend” or clustering of correct responses in Part 
II of the experiment as shown in Tables 2 and 2a. Although Vess Cola 
is presented 60 times here it is correctly identified only three times but 
is misidentified a total of 54 times as Coca Cola, Pepsi Cola, or RC Cola. 
It is failure on the part of S to use this naming response that spuriously 
overloads the cells of the other three brand names. Finally, statistical 
tests of the distribution of naming responses for each of the four brands 
do not justify rejection of the null hypothesis. In conclusion, while the 
writers are of the opinion that the foregoing evidence is sufficient to 
establish their interpretation of the facts, further investigation is necessary to 
corroborate these findings. 


Summary 


A group of 108 Ss was asked to identify one oz. samples of each of the 
following four Colas: Coca Cola, Pepsi Cola, RC or Royal Crown Cola, 
and Vess Cola, presented in counter-balanced order. Ss’ identification 
responses (Total = 432) were distributed as follows: Coca Cola, 132; 
Pepsi Cola, 147; RC Cola, 112; Vess Cola, 4; Cleo Cola, 10; Rock Cola, 
2; Dr. Pepper, 6; Other, 7; Don’t Know, 12. 
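
As a quick check on the tallies just listed, the following sketch sums them and computes the share of all naming responses that used one of the three familiar brand names. The variable names are illustrative; the counts are those given above.

```python
# Part I naming responses, as listed in the Summary.
responses = {
    "Coca Cola": 132,
    "Pepsi Cola": 147,
    "RC Cola": 112,
    "Vess Cola": 4,
    "Cleo Cola": 10,
    "Rock Cola": 2,
    "Dr. Pepper": 6,
    "Other": 7,
    "Don't Know": 12,
}
FAMILIAR = ("Coca Cola", "Pepsi Cola", "RC Cola")

total = sum(responses.values())
familiar_total = sum(responses[name] for name in FAMILIAR)
familiar_share_pct = round(100 * familiar_total / total, 1)

print(total, familiar_total, familiar_share_pct)
```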


1. There was a marked tendency to use only the three most familiar 
categories in the naming response so that the fourth brand was 
misidentified for the most part as one or the other of the three familiar colas. 
This was also true for the other three brands. 

2. Another group of 60 Ss, distributed into four subgroups of 15 Ss, 
each group being presented with four one oz. samples of the same beverage 
during the four successive trials, showed a similar tendency in its verbal 
responses. Misidentification of Vess Cola with the other drinks was 
most frequent but the popular brands were also commonly misidentified 
with one another. 

3. Both groups also showed a low frequency of misidentification of 
the four brands with lesser known colas and other soft drinks not used 
in the experiment. 

4. Although Ss who preferred or disliked a Cola drink did sometimes 
identify it correctly, nevertheless, from one-third to over two-thirds as 
often they also called one or more of the other cola drinks by the same 
name. 

5. It was noted that after the second or third taste judgment, Ss 
frequently and spontaneously remarked that they could not recall any 
other cola names and would repeat a previously used brand name. 

6. Statistical tests of the distribution of naming responses for each of 
the four brands further support the view that our Ss did not discriminate 
the four brands on a gustatory basis but rather applied a readily available 
repertoire of cola-naming responses. 

Received October 16, 1947. 


The Reliability of Job Evaluation Rankings * 


Philip Ash 
The Pennsylvania State College 


While a number of papers have been published recently concerning 
the validity and internal consistency of job evaluation systems (1), (3), 
(4), (7), little systematic study seems to have been made of the reliability 
of job evaluation ratings.¹ Rather, on the one hand reliability has been 
assured (or assumed) by the rating methods used, or on the other hand the 
limitations demonstrated in connection with personality and other trait 
ratings have been imputed to job evaluation ratings (5), (8). 

The writer believes that a great deal of merit inheres in both of these 
positions. In many companies using formal job evaluation systems job 
evaluation ratings are assigned by a process of group discussion and 
compromise. This practice reduces considerably the importance of 
independent rater-to-rater consistency. It is also true that a priori analysis 
would reveal a great similarity between the process of assigning a point 
rating to a job with respect to, say, “physical effort requirements,” and 
the process of assigning a rating to an individual with respect to, say, 
“sociability.” To that extent the results of study of the latter type of 
rating judgment apply equally to the former. 

The problem vis-a-vis job evaluation ratings still warrants independent 
investigation, however. In large companies it is common to find jobs 
“reallocated” and new jobs rated by a single analyst, for whose judgment 
there is no measure of reliability. Furthermore, there is good reason to 
believe that the analogy of personality trait ratings does not cover all 
salient points. 

This paper reports a single study of the reliability of rating judgments 
made by trained job analysts. It is believed that some of the results 
presented here will be applicable to any job evaluation installation, and 
will suggest directions for future investigation. 

* Grateful acknowledgment is made for permission to use data from the files of the 
Division of Occupational Analysis. Responsibility for all opinions and conclusions 
contained herein, however, is solely that of the writer. 
¹ Since this paper was written, a study very similar in design has been published by 
Lawshe, C. H., and Wilson, R. F. Studies in job evaluation. 6. The reliability of two 
point-rating systems. J. appl. Psychol., 1947, 31, 355-365. 


Description of the Project 


One of the products of occupational research conducted by the Divi- 
sion of Occupational Analysis is Part IV of the Dictionary of Occupational 
Titles (9). This volume offers a classification of occupations into 
homogeneous groups, “families” or “fields of work.” In each group the 
occupations are related on the basis of similarities in knowledges, skills, 
aptitude requirements, and other functional criteria. However, in pre- 
paring a revision of the volume it was decided to order the occupations 
within each group in terms of skill level. The skill level hierarchy could 
be used to indicate points of entry, promotional sequences, and channels 
of advancement. To accomplish this skill-level ordering, a point-rating 
job evaluation system was devised. Since it would not have been feasible 
to hold group rating discussions for each of the twenty-three thousand 
occupations to be included, some estimate of the probable reliability of 
individual analyst ratings was needed. A pilot study was therefore de- 
vised for this purpose. 

Five questions were posed for investigation: 


1. How reliably (i.e., with what rater-to-rater consistency) does the 
average analyst rate jobs? 

2. Do differences in consistency of rating appear between the various 
job evaluation factors? 

3. Are there any factors for which particular jobs cannot be rated 
consistently, even though the over-all consistency of rating on these factors 
is high? 

4. Are there any jobs which cannot be rated consistently, due to lack 
of information or for other reasons? 

5. To what extent do the factors overlap? 


The Job Evaluation System. A study was made of twenty-two of the 
most widely-used job evaluation plans. Nine factors, including what 
appeared to be all relevant components of the skill level of a job, were 
selected and defined. The factors included knowledge, physical skills, 
adaptability and resourcefulness, responsibility for material goods, 
responsibility for safety, responsibility for the work of others, physical effort, 
attention, and working conditions. The definitions were given in considerable 
detail. For example, the definition for the factor physical skills read: 

“The dexterities, coordination, and muscular discriminations required 
for successful manipulation of materials, tools, machines, or equipment to 
effect successful job performance. Evaluate developed physical skills 
involving any one or combination of: dexterity of fingers or other 
members, coordination of senses, hands and/or feet. Consider the complexity 
of necessary movements and the frequency and speed demanded; degree 
of coordination required between sensory cues and movement responses; 
accuracy or precision of movements or muscular judgments required; 
repetitiveness of movements; independence of finger, hand, foot, and/or 
leg movements.” 

The Job Sample. A sample of twenty-seven occupational descriptions 
was selected for the purposes of this study. Each description was a 
composite summary of information collected in from two to twenty-five 
independent job analyses. A description therefore constituted a detailed 
statement of the characteristics typically associated with the occupation, 
together with an indication of deviations from this average in the plants 
in which the basic job analyses were prepared. Each description included 
a statement of the duties performed in the occupation, machines, tools, 
and other work-aids used, working conditions and hazards involved, 
hiring requirements (sex, education, previous experience), promotional 
lines, and estimated worker characteristics (degrees of strength, dexterity, 
intellectual ability, and so forth) required for successful performance. 

It should perhaps be pointed out that the findings in this study relate 
to the reliability of ratings of the job descriptions. No analysis was made 
of the objectivity, reliability, or validity of these descriptions vis-a-vis 
the actual jobs. 

This is a problem common to all job evaluation programs, and one 
which might well merit study. What is usually subjected to rating is a 
brief abstract from the mass of data that constitutes a job. It may well 
be that points decisive for a fair skill level evaluation are overlooked, that 
biases and deficiencies in the original observations seriously distort rating 
judgments. 

It may be pointed out, however, that in view of the fact that the 
Division has had over thirteen years’ experience in developing the 
techniques of job analysis, and has established methods that yield consistent 
agreement among job analysts with respect to the characteristics of jobs 
studied, there is a presumption in favor of the validity of the descriptions 
used here. 

The specific jobs included in the sample were: bookkeeper, bootblack, 
cabinetmaker, clerical checker, chef, ditch digger, deep-sea diver, spinning 
doffer, heating and ventilating equipment draftsman, garment factory 
foreman, gardener, hat designer, machine oiler, metal machinist, olive 
packer, paper cutter (machine), physician (on a ship), plasterer, 
bull-ladle pourer, president of a refrigerator manufacturing company, 
punch-press operator, housefurnishings salesman (retail), petroleum stillman, 
hand trucker, typist, night watchman, window calker. This range of 
jobs is considerably greater than that found in a typical plant. It is 
probably only a very restricted sample of the jobs in which the USES 
is interested. 




The Analysis. Ten job analysts participated in the study. All of 
them had had considerable experience in job analysis, occupational 
classification, and related phases of occupational research. The range of 
experience was from two to twelve years; the mean for the group was 4.8 
years’ experience. In addition, a training session was held to discuss the 
purposes of the study, to review the factor definitions, and to ensure that 
uniform procedures would be followed. 

Procedures. Each analyst was provided with a complete set of the 
descriptions and the factor definitions. Since no point-values had been 
established, the analysts were instructed to rank the sample for each 
factor, treating each factor independently to reduce any halo effect which 
might operate. The job ranking lowest was assigned the number “1,” 
the job which ranked highest was assigned the number “27.” As 
necessary, adjustments were made for ties. 
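
The paper does not say how ties were adjusted. One common convention, sketched here purely as an assumption, gives tied jobs the mean of the rank positions they jointly occupy, so that the ranks still sum to n(n + 1)/2. The function name and the sample judgments are hypothetical.

```python
def average_ranks(values):
    """Rank values from lowest (1) to highest, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


# Hypothetical judged levels for four jobs on one factor; two jobs are tied.
judged_levels = [10, 20, 20, 40]
print(average_ranks(judged_levels))
```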


Statistical Findings 


Reliability of the Raters. For each job, the median of the analysts’ 
ranks for each factor was determined. The rankings of each analyst were 
correlated with the median array for each factor. Using the resulting 
correlation matrix a median coefficient was determined for each factor 
and for each analyst. 

In addition, for each factor an average intercorrelation coefficient was 
calculated. These coefficients are given in Table 1. 
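
The two kinds of coefficient described above can be sketched for a single factor as follows. The rankings are hypothetical (three analysts, five jobs), the correlation is an ordinary product-moment correlation applied to ranks, and taking the average intercorrelation as the mean of all analyst-pair correlations is an assumption; the paper's exact averaging procedure is not reproduced here.

```python
from itertools import combinations
from statistics import mean, median


def pearson(x, y):
    """Product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical ranks given to five jobs by three analysts on one factor.
rankings = {
    "A": [1, 2, 3, 4, 5],
    "B": [2, 1, 3, 4, 5],
    "C": [1, 3, 2, 5, 4],
}

# Median rank of each job across analysts (the "median array").
median_array = [median(job_ranks) for job_ranks in zip(*rankings.values())]

# Correlation of each analyst's ranking with the median array.
with_median = {name: pearson(r, median_array) for name, r in rankings.items()}
factor_median = median(with_median.values())

# Mean of the correlations between every pair of analysts.
average_intercorrelation = mean(
    pearson(rankings[a], rankings[b]) for a, b in combinations(rankings, 2)
)

print(median_array, with_median, factor_median, average_intercorrelation)
```

In this toy example the average intercorrelation is lower than the median correlation with the median array, because each analyst's own ranks help determine the medians they are compared against.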

Analysis of Table 1 suggests that, given trained analysts, a very high 
degree of consistency in job evaluation ratings may be obtained. The 
range of the reliability of the analysts, expressed in terms of the median of 
the correlations made by the analyst on the nine factors, is from .81 to .94. 

Furthermore, of the ninety coefficients giving the correlation of the 
rankings of an analyst on a factor with the median rankings for the factor, 
forty-nine exceed .90, twenty-six are in the range .80 through .89, twelve 
are in the range .70 through .79, and only one is very low (.25). This 
last is the only coefficient which might have been obtained merely on the 
basis of chance expectancy. 

Differences in Consistency with Respect to the Factors. The reliability 
of the differences among the various correlation coefficients was not 
calculated. Examination of the quantities in Table 1 will indicate the 
general picture, however. 

The average intercorrelations are perhaps the most pertinent indices 
with respect to factor consistency. As is to be expected, they are 
somewhat lower than the medians of the individual correlations for each factor. 
The average intercorrelations suggest the probable magnitude of the 
reliability of a single ranking. They are therefore more revealing in 
relation to the question of independent rater-to-rater consistency. It 
should be noted that, while these coefficients all exceed chance expectancy 
by a wide margin, they range in magnitude from .39 (“attention”) to 
.93 (“adaptability”). 


Table 1 

Reliability of Rankings on Nine Job Evaluation Factors 
Note: 27 jobs, ten analysts are involved. 

[Table body illegible in the source copy. Rows: the nine factors and the 
analyst medians; columns: analysts A through J, the median, and the average 
intercorrelation.] 

The standard error of r when r = 0 is .196. 
For the intercorrelation procedure see Peters and Van Voorhis, Statistical 
procedures and their mathematical bases (revised ed.). 


It is pertinent to observe, with respect to the factor “attention,” how- 
ever, that the coefficient reflects the extreme disagreements of analyst C. 
If analyst C's rankings were dropped, the average intercorrelation coeffi- 
cient would be raised to .59, which would be comparable to those for 
“physical effort" and “responsibility for the work of others.” 

As a guide to action, it seems reasonable to conclude that the job 
information available and the understanding of the definitions were 
satisfactory for the factors “knowledge,” “adaptability,” and “working 
conditions.” On the other hand, improvement could be sought in the re- 
maining six factors. 

This is borne out by an analysis of the range of ranks assigned each 
job for each factor. The average rank-range for “knowledge,” 
“adaptability,” and “working conditions” was 6.6 ranks, 7.7 ranks, and 8.6 ranks 
respectively. On the other hand, the average rank-range for “attention” 
was 15.4 ranks, for “physical skills” it was 11.5 ranks, and for 
“responsibility for materials” it was 11.4 ranks. The spread of rankings for a 
job on a factor provides a rough index of the consistency with which the 
job is rated on the factor. 
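
The rank-range index can be sketched as below, taking the range for a job as the highest rank any analyst gave it minus the lowest. That definition and the rankings shown are assumptions made for illustration; the paper does not spell out the computation.

```python
from statistics import mean


def rank_ranges(rankings):
    """Highest minus lowest rank given to each job across analysts, on one factor.

    `rankings` holds one list of ranks per analyst, with jobs in the same order.
    """
    return [max(job_ranks) - min(job_ranks) for job_ranks in zip(*rankings)]


# Hypothetical ranks given to five jobs by three analysts on one factor.
factor_rankings = [
    [1, 2, 3, 4, 5],
    [2, 1, 3, 4, 5],
    [1, 3, 2, 5, 4],
]

ranges = rank_ranges(factor_rankings)
average_rank_range = mean(ranges)
print(ranges, average_rank_range)
```

A job with an unusually wide range on a factor is a candidate for the kind of follow-up described in the next section, where analysts were asked which jobs were hard to rank.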

Consistency of Ratings of Jobs. Paterson (6) has suggested that it is 
questionable whether all job evaluation factors are equally applicable to 
all jobs to which they are applied. In an effort to determine whether any 
of the factors were inapplicable to any jobs, or whether any of the jobs in 
the sample were particularly difficult to rank with respect to particular 
factors, the analysts were asked to report their reactions in this respect. 

The analysts were uniformly of the opinion that all the factors were 
applicable to every job, if only to establish a point of the ranking 
continuum. For example, “responsibility for the work of others,” which 
may be interpreted broadly as supervisory responsibility, was deemed 
applicable to clearly non-supervisory jobs. The factor provided a basis 
for making a skill level discrimination in this area. The same comment 
is applicable to any of the factors, on the proposition that absence of the 
factor from the job is itself pertinent to evaluation of the job. 

With respect to the particular jobs included in the sample, however,
the question of applicability frequently became one of adequacy of
information. For four jobs the rank-range on each factor was wide. The
analysts reported that these were jobs for which the information was too
ambiguous or too scanty to make a ranking judgment with confidence.
It was also found that, for particular jobs, personal biases of one kind
or another operated in one or more of the factors. These led to
interpretations at variance with the interpretations placed either on the
job or factor by the group as a whole. For example, should the Physician's
performance of operations be ranked low on “working conditions”? To
what extent is a worker responsible for the safety of others when he works
in [...] places and may drop his tools on another worker?

The information collected served as a basis for reconsideration of
the definitions, and pointed to the need for trial runs to ensure that
the definitions adequately cover all variations of application of the
factors.

Factor Overlap. In view of the small size of the sample and the
deficiencies revealed in the data and the factor definitions, only a hasty
study was made of factor overlap. The correlations between each pair of
median rankings were calculated. These correlations are given in Table 2.
It is obvious from Table 2 that the factors do not represent independent
variables. In fact, some of them seem to be almost identical
(“knowledge” and “adaptability”).


Table 2


Intercorrelations Among Nine Job Evaluation Factors
Note: 27 jobs, correlations of median ranks of 10 analysts.

[Correlation matrix not recoverable from the source.]
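
The factor-overlap computation described above (a median rank per job on each factor, then a correlation for every pair of factors) can be sketched as follows. This is an illustration under stated assumptions: the article does not say which correlation coefficient was used, so the Pearson product-moment formula is applied here to the median ranks, and the factor data and function names are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import median


def median_ranks(rankings):
    """rankings[a][j] is analyst a's rank for job j; return one median per job."""
    n_jobs = len(rankings[0])
    return [median(analyst[j] for analyst in rankings) for j in range(n_jobs)]


def pearson(x, y):
    """Pearson product-moment correlation (assumed; the source names no formula)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def factor_intercorrelations(by_factor):
    """Correlate the median rankings of every pair of factors."""
    medians = {name: median_ranks(r) for name, r in by_factor.items()}
    return {
        (f, g): pearson(medians[f], medians[g])
        for f, g in combinations(medians, 2)
    }


# Hypothetical data: 3 analysts ranking 4 jobs on each of 3 factors.
data = {
    "knowledge": [[1, 2, 3, 4], [1, 2, 4, 3], [2, 1, 3, 4]],
    "adaptability": [[1, 2, 3, 4], [1, 3, 2, 4], [1, 2, 3, 4]],
    "physical effort": [[4, 3, 2, 1], [4, 3, 1, 2], [3, 4, 2, 1]],
}

for pair, r in factor_intercorrelations(data).items():
    print(pair, round(r, 2))
```

A coefficient near 1 for a pair of factors, as the article reports for “knowledge” and “adaptability,” suggests that the two factors order the jobs in nearly the same way.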


Summary and Conclusions


In a pilot study designed to determine the reliability of job evaluation
rankings made by trained analysts, ten analysts ranked twenty-seven jobs
on nine factors. The following conclusions seem justifiable:

1. That in general a high degree of reliability of analyst judgment may
be anticipated.


2. That consistency of rating appears to be in part at least a function 
of the factor rated, and in part a function of the job information available. 
It would therefore seem desirable to determine analyst reliability for each 
factor independently, and to make adjustments accordingly. 

3. That in a set of as many as nine factors it is probable that at least 
some of the factors overlap and may be dispensed with. However, it
may be that the elimination of one of a pair of correlated factors will alter 
appreciably the interpretation of the remaining factor, which will absorb 
some of the implications of the deleted factor. It would seem desirable to 
discover whether the overall order of jobs does in fact remain the same 
when such eliminations are made. 


Several limitations of the study should be noted. In the first place,
the reliability noted relates to the content of job descriptions; whether
the rankings are valid for the jobs themselves remains to be determined.
In the second place, the great variability of jobs in this sample, as
compared with the array of jobs usually found in a single plant, probably
tended to increase the reliability coefficients. Finally, since ranks were
used in this study, rather than a longer point-rating continuum, a slight
loss in comparability with the usual job evaluation plan has possibly
resulted.


Received July 14, 1947. 

References

1. [...] study of a job evaluation point system. Modern Mgt., [...]
2. [...], and McAuley, T. M. Salary evaluation. Personnel, 1941, 18, [...]
3. Lawshe, C. H. Towards simplified job evaluation. Personnel, 1945, 22, 153-160.
4. Lawshe, C. H., and others. Studies in job evaluation: Part 1. Factor analysis of
   [...] 1945, 29, 177-184; 1946, 30, 117-198, 310-319, 426-[...]
5. [...] H. Problems and methods in job evaluation. J. consult. Psychol., 1944, [...]
6. Paterson, D. A statistical basis for setting wage rates. Amer. stat. ass'n J., 1989, [...]
7. Rogers, O. Analysis of two point-rating job evaluation plans. J. appl. Psychol.,
   1946, 30, 57[...]
8. Viteles, M. S. A psychologist looks at job evaluation. Personnel, 1941, 17, 165-176.
9. War Manpower Commission, Division of Occupational Analysis. Dictionary of
   occupational titles: Part 4. Entry occupational classification. Government
   Printing Office, Washington, 1944.


Special Review: Psychology in an Ideal University *


Walter V. Bingham
1661 Crescent Place, Washington 9, D. C.


The place of psychology in an ideal university is the theme of a little
book which psychologists are reading with relish because it portrays the
[...] flowering of their young science into a cluster of technologies, and
points to prevailing academic shortcomings which will be corrected only
after recognition of these developments.

This fascinating brochure is the report of a University Commission
appointed by President Conant in the spring of 1945 to advise on the
place of psychology at Harvard. He had asked the Commission questions
like these:

“What different types of professional psychologists should be trained by
the University and in what faculties? What coordination, if any, should we
attempt between the work of the psychologists and the psychiatrists in the
different faculties? What instruction in Psychology is a desirable part of
the education of undergraduates or students in our various professional
schools? Should we recognize the different types of Psychology by suitable
labels on our professorships and separate methods of reviewing permanent
appointments, or should we attempt [...] over-all committee on [...] which
[...] appointments in the whole field?”

Under the chairmanship of the Rockefeller Foundation’s Director for 
the Medical Sciences, the Commission consisted of six eminent psycholo- 
gists and of six equally eminent representatives of related fields, four of 
them connected with medicine.

The report is in three parts. First the reader is reminded of the
[...] and the vastly expanded range of psychology; its methods, both
traditional and novel; its unrealized potentialities for the whole field of
education; and the strength and cogency of the appeal made to the general
public as well as to students by this science and by its developing
psychotechnologies. From this section a reviewer is tempted to quote
several pungent paragraphs like this:

“Most of the current reports and policy statements on education show an
almost exclusive concern with curricula and their presumed effects. [...]
well meant, this ignorance of the realities [...] effect. [...] in a dean's
office (or in [...]) [...] actual considerations, namely, the
psychological adjustments of individual human beings, of teachers as well as
students, to the processes aimed at in the curricula. Do not the motivation
and personality of the student matter in education? Are we wise or even
[...] in [...] as negligible or unmanageable the varieties of student
capacities, tastes, and temperaments? Does not this aspect of education in
a free society need much more attention and explicit care? We insist that
it does.”

And again: “The threat that psychology may pass into the hands of [...]
not recognized by the profession is not an idle one. [...] the only
professional group whose status [...] intransigence. What physicians refuse
to [...] advantage of medicine as a field of learning. The major
op[...] for intelligent applications of psychological knowledge and skills
appears [...] diagnostic study, educative treatment, and vocational
guidance of the presumptively normal individual.”

* Alan Gregg and others. The place of psychology in an ideal university.
Cambridge: Harvard University Press, 1947. Pp. x + 42. $1.50.


In Part 2 the Commission reviews the main purposes of psychology
in a university. These are not merely to make available to beginners
and to advanced students a body of established facts and laws, an
applicable content. Most valuable is the revelation that so many human
functions, not excluding those of emotion, motivation and ways of
[...]ing, can be approached objectively, dispassionately, scientifically.
The value of other fields of scholarship to the student of psychology, and
of psychology to students in other fields and in the professional schools,
is appraised.

After these preliminaries the Commission briskly approaches its more
specific task of formulating recommendations regarding a plan of policy,
organization, courses and facilities for a Department of Psychology.

Five points of policy are called axiomatic:

“For the young and growing subject that is [...] freedom for change and
[...] between all the psychologists in [...] the department should [...]
and [...] sources [...] interest and [...] [...]versity should be attached
to the Department [...] The present complete independence of a psychologist
[...] from the psychologists in the Department weakens [...] isolates the
individual from the main-stream of the [...] medicine, business,
engineering, teaching, the [...] and government.

“For which of these professions is a deliberate ignorance of the subjects
recommended an advantage? And in which of these subjects may one assume
that modern psychology can add nothing to the current credos and
incredulities?”


These courses would be taught by psychologists holding joint appoint- 
ments in their professional schools and in the Department of Psychology. 

“If administrative arrangements make this difficult they should be changed.
. . . Strong as may be the tradition of separatism and [...] the various
faculties of a university, a larger measure of interchange, collaboration,
and mutual confidence between the faculties deserves a ten years' trial.”

A professional degree, the doctorate in psychology, should be given in 
the Department of Psychology to persons it has trained as practicing 
psychologists. Concurrence in this recommendation on the part of mem- 
bers of the Commission was not unanimous, although they were all in 
hearty agreement that the ideal university must offer much more thorough 
training and supervised experience than heretofore to prepare students 
to enter on the practice of a psychological profession. The report does 
not seriously discuss the possibility that training for this Psy.D. degree 
might be entrusted to a new professional school of psychology analogous 
to those which prepare for the professions of divinity, business, law, edu- 
cation or medicine, leaving the basic science to be developed and taught 
by the academic department in the faculty of arts and sciences.

The profession of psychology is in its adolescence, a period of rapid
growth. Reaching out gawkily in many directions, only a few of which
are touched upon in this little volume, it is struggling toward a mature
professional status comparable with the honored academic status it has
maintained for sixty years.

Since the Harvard Commission surveyed the scene, the American
Psychological Association has undertaken its new responsibilities of
examining for certification as to professional competence those of its
members who want to practice in one or more of several specified fields of
application. Under these circumstances, why not postpone consideration of
any new professional degree until sufficient experience has given a clear
answer to the question as to whether certification cannot be so well
administered that it accomplishes what is required in the way of definition
and maintenance of professional standards?

One other recommendation in the report, that concerning unity of
organization in a single department, is provoking lively argument. The
comprehensive department, as the Commission calls it, has long since
demonstrated its strength and fertility in a great expanding university
such as Ohio State. On the other hand, the fact that progress can also be
made and standards maintained in a university where several relatively
independent units exist, in the college of liberal arts, the school of
education, and such professional schools as business, engineering, law and
medicine, is supported by a glance in the direction of Chicago, for example,
or Michigan, or most spectacularly, Harvard itself, where the centrifugal
force has been extreme. The shepherd of a large flock of psychologists,
of whom the most brilliant are not unlikely to include individualists
deeply immersed in their unique problems, has no easy task to keep them
in one fold, in one community of scholars, when they want to form at
least seven sharply separated communities named Experimental Psychology,
Social Psychology, Clinical Psychology, Psychobiology, Industrial
Psychology, Educational Psychology, and Medical Psychology; while a
few find isolated shelter and plenty of work in an institute for child
development or a school for nurses or for psychiatric social workers, or in
the offices of admissions, student health, the vocational and educational
counseling service, or a bureau dedicated to reduction of highway traffic
accidents.

Intermediate between the extremes typified by Harvard and Ohio
State are universities such as Minnesota where team work among all the
psychologists whatever their duties has been effectively encouraged partly
through a relatively simple organizational structure but chiefly by
long-continued planned cultivation of the purpose to collaborate.

The suggestion has been offered that a sharing of mutual helpfulness 
among the psychologists of a great university would be facilitated by 
bringing them together under one roof. A generous alumnus might 
provide one vast building to house the entire group; and this should help 
even though certain psychologists would yearn for closer intimacy with 
colleagues in other departments: government, labor relations, neural
anatomy, machine design, biochemistry, sociology, or the Dean's office; and
a few, preferring insulation from their own kind, would seek hide-aways
far from this central building. 

After all, community among scholars is fostered principally by the
will of each to share actively in the thinking and interests of his fellows.
In an ideal university, this purpose is widespread and rooted deep. 

There are times when the drive toward organizational growth by
fission becomes powerful, as it did in the American Psychological
Association during the '30s when hundreds of practicing psychologists, fed
up with years of relatively sterile programs and the narrowly academic
[...] Association for Applied Psychology which promptly burgeoned into a
vigorous, aggressive enterprise. The parent group then awoke to the fact
that the majority of its members were practitioners, not teachers, of
psychology. Facing this reality, its policies, structure and programs were
drastically revamped, and the secessionists willingly came home.

So it may be, some day, in the Ideal University.


Correction 


To the Editor: 

In the article "Communication Between Management and Workers” 
by Paterson and Jenkins which appeared in the February, 1948, issue of 
Journal of Applied Psychology, is an erroneous statement on page 72 
which I should like to correct. The statement is as follows: “Presumably, 
information about scientific management procedures and techniques is 
transmitted from management to workers by verbal means only.” 

This is an unwarranted assumption and contrary to the facts. Refer- 
ences to the following pages (and others), in my book “Industrial Man- 
agement in Transition” will serve as evidence on this point: Pages 63, 
70, 162, 248, 253, 269. For example, on page 70 where “mechanisms” 
are discussed, one of the mechanisms listed is “instruction cards for the 
workmen.” These, of course, are written. In Taylor’s work, the im- 
portance of "standard written instructions” is emphasized, although no 
emphasis was placed upon typography. The charts of Henry L. Gantt 
were also means of communication as were the Gilbreth, and more recently,
Mogenson’s micromotion studies.
Signed: George Filipetti 


Professor of Economics and Business Administration, 
University of Minnesota 


Reply 

Dr. Filipetti is technically correct. Incidental mention is made of 
written instructions and communications on the pages cited, but nowhere 
is the problem of scientifically constructed written communications to 
employees mentioned. The index of the book seems to be deficient. 
Nowhere can we find in the index such items as instruction cards, com- 
munications, written communications, charts, means of communication, 
understanding of written instructions, or any other term suggesting that
scientific management people seriously considered, at any time, [...]
of insuring comprehension of management communications on the
part of workers.
Signed: Donald G. Paterson and James J. Jenkins 


Department of Psychology, 
University of Minnesota 


Book Reviews 


Zeisel, Hans. Say it with figures. New York: Harper and Brothers, 
1947. Pp. 250. $3.00. 


This book is written by Hans Zeisel, who was in Vienna with Lazars- 
feld, and who has worked with him in the Columbia Bureau of Social 
Research, and is now Manager of Research Development for
McCann-Erickson. It deals with methods for handling questionnaire data of
the sort obtained in attitude, opinion, and consumer research. The book is
unique, has an excellent selection of topics, and overlaps zero with the 
usual statistics book. In fact, it isn’t statistics at all, but rather is com- 
mon sense and a little arithmetic. Zeisel demonstrates his points with 
abundant research findings from the files of the Bureau of Social Research, 
from advertising, and from public opinion results. It is a most welcome
and needed addition to the literature. 


Kenneth E. Clark 
The University of Minnesota 


Gordon, H. Phoebe, Densford, Katherine J. and Williamson, Edmund C.
Counseling in schools of nursing. New York: McGraw-Hill, 1947. Pp.
xiii + 279. $3.00.

This book has been prepared for administrators, teachers, supervisors, 
head nurses and all other individuals who, through their contacts with 
students in schools of nursing, contribute to the success of the adjust- 
ments which students make. The authors hope the book will also be of 
value to hospital administrators in helping them to develop greater
understanding of the problems of student nurses.

The book is divided into four parts. Part One, The Professional
Background of the Student in Nursing, includes a brief presentation of the
[...] nursing as a service and as a profession. The [...] types of
organizations including hospitals, general and special; public health
agencies; and out-patient departments. The curricular experience [...]
organized class-room instruction, and a clinical curriculum which is
carried forth in a milieu of myriad relationships with patients and many
other individuals. Because of the complexity of present day nursing
education, and the increased demands which are placed upon nurses, the
importance of an adequate personnel program is stressed.

In Part Two, Understanding the Student in Nursing, the psycholo- 
gical characteristics and the social background of student nurses are 
discussed. In Part Three, Counseling and Personnel Services in Schools 
of Nursing, seven chapters are devoted to discussions of the nature of 
student counseling and personnel services, measures used in selecting and 
counseling students, student orientation, student counseling service, dis- 
ciplinary counseling, student health programs, and extra-curricular acti- 
vities. Part Four, Developing the Personnel Program in a School of 
Nursing, presents problems which are involved in the organization of the 
program, and gives suggestions for both organization and continuing de- 
velopment of the program. 

This book, which comes at a time when nursing leaders and hospital 
administrators are seeking answers to pressing personnel problems, is a 
valuable contribution to nursing literature. The book is so well organized 
and the material so clearly presented that it should be of definite value to 
individuals who do not have a broad background in personnel work. 
Though no attempt is made to discuss the underlying philosophy of 
personnel work as such, the interrelationships between the needs of the
individual and the needs of society are emphasized in every section of the
book. The chapters on student counseling and student discipline seem 
of particular value. Though the employment of a qualified personnel 
director is considered an essential step in the development of the program, 
the counseling functions of all individuals who come in contact with 
students are repeatedly emphasized.

This book, in keeping with its underlying philosophy, is not in the least
authoritative. The individual who looks for definite answers to problems
of a particular situation will not find them. However, the person who
wishes to develop an understanding of the basic principles upon which
the entire counseling program is founded, and to utilize those principles
in developing a program which meets the needs of a particular situation,
will find this book of great value. As a basic text in courses in personnel
work for graduate nurses, supplemented with suggested readings which
are given at the close of each chapter, this book will also fill a definite
need.


Helen Nahm 


Division of Nursing Education,
Duke University, Durham, N. C. 




Bonnardell, R. L'adaptation de l'homme à son métier. (2nd ed.) Paris:
Presses Universitaires de France, 1946. Pp. 199. 120 French francs.


The title and, even more so, the sub-title (A study in social and in- 
dustrial psychology) indicate a broad scope. Actually, the content is 
limited to an exposition of psychometrics, with special reference to
vocational guidance and industrial selection. The author, who started his
research career by physicochemical studies on excitability of muscles and 
isolated nerves, utilizes to advantage both his knowledge of the theory of 
quantitative psychology and his extensive experience as an industrial 
psychologist for the Peugeot Automobile Company. After a survey of 
pseudo-scientific (physiognomy, graphology) and traditional techniques
of evaluating the aptitudes of applicants for employment (application 
blank, recommendations, interview, job trial), criticized on account of 
their subjectivity, the author presents the history and principles of 
psychometrics. Miniature job situations (“synthetic tests") are men- 
tioned but the emphasis is put on the analytical (componental) approach 
and the statistical treatment, applied to both the test scores and the 
criteria of job proficiency, with preference for Thurstone’s system of 
factor analysis. Steps to be taken in establishing psychometric services 
in a plant are described. 


Josef Brozek 
Laboratory of Physiological Hygiene, 
University of Minnesota 


Ryan, Thomas A. Work and effort: The psychology of production. New 
York: The Ronald Press Company, 1947. Pp. xii + 323. $4.50. 


A few universities already have in their curricula a course entitled
“Experimental Industrial Psychology.” This book would make an
excellent textbook for such a course and may stimulate the setting up of
more courses of this nature. For Ryan does not make ex cathedra
pronouncements on the topics he is surveying but shows the reader how the
conclusions were arrived at by giving him a summary of the research
methods and results upon which the conclusions were based. The
approach leads naturally to a consideration of the validity of the findings
and keeps the interpretations close to the facts. It serves to warn the
applied psychologist of the danger of making recommendations beyond
the data at hand and to stimulate industry to support basic psychological
research with the same understanding with which they have underwritten
engineering research on their technical problems.

The content of the book is organized around the two basic problems
of efficiency and motivation in work activities. “Efficiency” is defined as
the ratio between input and output, both of the variables being considered
in [...] sense; output includes worker satisfaction as well as rate of
performance, and input includes all adverse effects of the work: energy
expenditure, effort, fatigue, effects upon health, personal adjustment in
[...], the worker's time, etc. In contrast to efficiency, in which the
[...] is to increase the output obtained from a given level of effort,
“motivation” is concerned with raising the level of effort and thus
increasing the worker's productivity.
So far as basic methodology is concerned, the chief problem is measuring
the input variable and Ryan devotes three chapters to the various
approaches to measurement of the cost of work. Work activities are
classified into two major categories: muscular work and sedentary work.
Measures of cost in terms of energy expenditure, reduced capacity,
metabolic changes, fatigue tests, long-term production trends, etc. are
evaluated and the specific limitations of each are pointed out. A final
summary groups the various measures into three classes: (1) promising
indices still in the developmental stage, (2) a few established measures
which have [...] application, (3) crude indices of limited validity, but
which provide rough solutions until more refined methods become practically
useful.
In contrast to the extended treatment of the basic problem of measurement
of the cost of work to the worker, the surveys of factors affecting
efficiency are relatively brief. In line with the author's high standards
of research methodology, selected studies are described and cautious
conclusions drawn with respect to the influence upon efficiency of noise,
temperature, ventilation, illumination, hours of work, rest periods, and
sleep. A brief chapter on work methods presents a critique of motion study
as viewed by a psychologist.
The topics described above account for about half the book and
represent a rather unified survey of the problem of efficiency. The second
half of the book presents separate discussions of incentives, boredom, rate
setting, merit rating and job evaluation, accident control, and industrial
training. In each case, however, the experimental viewpoint is maintained
and the relevance of the findings to efficiency and motivation is
pointed out. Of particular interest are the evaluation of time study from
a psychological viewpoint and the rather devastating attack on rating
methods in common use. The most extensive treatment is of accident
control, presumably because of the greater amount of experimental data
in that area.
Considering the magnitude and difficulty of the task which the author
set for himself, the outcome is to be admired. The nature of the approach
necessitated a detailed treatment of selected research rather than a
summary of all available data. Probably no two authors would agree on
which studies should be included in such a treatment but this reviewer
found discussed most of the basic studies with which he is familiar. The
most notable omissions were the RCA studies on music in industry and the
Dartmouth studies on visual fatigue.

Like other writers, Ryan attempts to clarify some of the basic concepts
by definitions of terminology such as efficiency, effort, energy
expenditure, etc., and to organize the concepts into a logical framework.
This is not the place to evaluate the validity or usefulness of his
particular constructs. If it stimulates further research in this area it
will have served its major purpose.

Albert S. Thompson
Vanderbilt University


Franziska Bumgarten, Die Psychologie der Menschenbehandlung im
Betriebe, 2nd ed., Zurich, Rascherverlag, 1946, pp. 304.


The book discusses the psychology of inter-personal relationships
between the executive and the subordinate in industry in a mainly
[...] quantitative and practical manner. Such relationships are now
generally recognized as a factor of paramount importance in determining the
progress of industrial enterprise.

Employers are frequently interested in very general suggestions that
can be used with all employees in order to produce an increase in
efficiency or a harmonious atmosphere in industry. However, the author
[...] that the control of personal relationships in industry can only be
[...] if based on an understanding of the psychology of the employee. The
worker should be treated as an individual with all his assets, weaknesses,
and idiosyncrasies. Only after having studied and understood the
personality of his subordinate can a supervisor, foreman, etc. choose the
appropriate approach.

The author presents an extensive study of different types of employers
and employees, describing their approach to their work, to their superiors,
subordinates, and co-workers. She describes in detail and [...]
thoroughly their respective reactions at the time which she calls the
“critical moment” of their relationships, when an order is given or
executed, when control is exercised or submitted to, when a reprimand is
expressed or received, when a punishment is given or taken.

The book should be useful to managers, supervisors, foremen, [...]
helping them understand the group of people by whom they are
surrounded in their work. Many of the suggestions given in great [...]
could probably prove very effective in dealing with human [...]
intelligently applied by men in executive positions. The author is
strongly aware of the fact that intelligent methods in industrial [...]
and guidance cannot always be produced by suggestions, even if
carefully followed, but that the use of such practices is strongly dependent
on the executive’s own emotional make-up.

We can congratulate ourselves that in this country relations between 
the executive and the subordinate appear to represent a real partnership 
more than in Switzerland, where the book was published; that the power 
of the executive is counterbalanced by a sense of self-respect and dignity 
of the American worker. 

The book is written in a fluent and precise style. The author follows 
her line of thought throughout the book with impeccable logic. 

Michael Joelson 


David Webb Company, 
Edinburg, Indiana 


New Books, Monographs, and Pamphlets


Books, monographs, and pamphlets for listing and possible review should be sent to 
Donald G. Paterson, Editor, Department of Psychology, University 
of Minnesota, Minneapolis 14, Minnesota 


Office library of an industrial relations executive. Helen Baker.
Princeton: Industrial Relations Section, Princeton University, 1946.
Pp. 35. $.50.


American junior colleges. Second Edition. Jesse P. Bogue, Editor.
Washington, D. C.: American Council on Education, 1948. Pp. 500.
$6.50.

American universities and colleges. Fifth Edition. A. J. Brumbaugh,
Editor. Washington, D. C.: American Council on Education, 1948.
Pp. 1100. $8.00.

Case histories in clinical and abnormal psychology. Arthur Burton and
Robert E. Harris, Editors. New York: Harper and Brothers, 1947.
Pp. 680. $4.00.

Reading and visual fatigue. Leonard Carmichael and Walter F.
Dearborn. Boston: Houghton Mifflin Co., 1947. Pp. 483. $5.00.

Hearing and deafness. Hallowell Davis, Editor. New York: Murray
Hill Books, Inc., 1947. Pp. 496. $5.00.

Handbook of job facts. Alice H. Frankel. Chicago: Science Research
Associates, 1948. Pp. 160. $2.00.

Guidance testing. Clifford P. Froehlich and Arthur L. Benson. Chicago:
Science Research Associates, 1948. Pp. 104. $1.00.

Educational measurement and evaluation. N. L. Gage and H. H.
Remmers. New York: Harper and Brothers, 1943. Pp. 580. $3.00.

Personnel and industrial psychology. Edwin E. Ghiselli and Clarence W.
Brown. New York: McGraw-Hill Book Co., Inc., 1948. Pp. [...].
$4.50.

A trade union analysis of time study. William Gomberg. Chicago:
Science Research Associates, 1948. Pp. 256. $4.25.

The contemporary American family. Ernest R. Groves and Gladys H.
Groves. Philadelphia: J. B. Lippincott Co., 1947. Pp. 838. [...]

Theories of learning. Ernest Hilgard. New York: D. Appleton-Century
Co., Inc., 1948. Pp. 409. $3.75.

Encyclopedia of vocational guidance. Volumes I and II. Oscar J. Kaplan,
Editor. New York: Philosophical Library, 1948. Pp. 1422. [...]



Psychological atlas. David Katz. New York: Philosophical Library,
1948. Pp. 142. $5.00.

Psychological warfare. Paul M. A. Linebarger. Washington, D. C.: 
Infantry Journal, 1948. Pp. 259. $3.50. 

The psychology of abnormal people. John J. B. Morgan and George D.
Lovell. New York: Longmans, Green and Co., Inc., 1948. Pp. 673.
$4.50.

A guide to confident living. Norman V. Peale. New York: Prentice-Hall,
Inc., 1948. Pp. 248. $2.75.

International directory of opinion and attitude research. Laszlo Radvanyi, 
Editor. Mexico: National University of Mexico, 1948. $6.00 paper, 
$7.00 cloth. 

Mental health in modern society. Thomas A. C. Rennie and Luther E. 
Woodward. New York: The Commonwealth Fund, 1948. Pp. 424. 
$4.00.

Logic and scientific methods. Herbert L. Searles. New York: The 
Ronald Press Co., 1948. Pp. 326. $3.50. 

Vocational counseling and placement in the community in relation to labor 
mobility, tenure, and other factors. Pamphlet 5. Carroll L. Shartle. 
New York: Social Science Research Council, 1948. Pp. 18. $.25.

Psychiatry for the pediatrician. Hale F. Shirley. New York: The Com- 
monwealth Fund, 1948. Pp. 442. $4.50. 

How to develop your executive ability. Daniel Starch. New York: Harper 
and Brothers, 1946. Pp. 267. $3.00. 

Industry and society. William F. Whyte, Editor. New York: McGraw-Hill
Book Co., Inc., 1946. Pp. 211. $2.50.

The psychology of teaching. Asahel D. Woodruff. New York: Longmans, 
Green and Co., Inc., 1948. Pp. 278. $3.00. 

Child care, questions and answers. Children’s Welfare Federation of New 
York City. New York: Doubleday and Co., 1948. Pp. 159. $2.00. 

Labor force definition and measurement. New York: Social Science Research
Council, 1947. Pp. 134. $1.00.

The nation’s most prosperous industry. New York: Textile Workers
Union of America, CIO, 1948. Pp. 24.

Opportunities for psychologists, psychiatrists, psychiatric social workers.
Pasadena: Western Personnel Institute, 1948. Pp. 38. $1.00.

Annual report of the Federal Security Agency. Washington, D. C.:
Superintendent of Documents, U.S. Government Printing Office, 1947.
Pp. 41. $.15.


JOURNAL OF APPLIED PSYCHOLOGY 


                    PRICE PER    PRICE PER
YEAR    VOLUME      NUMBER       VOLUME

(The AVAILABLE NUMBERS columns, headed MAR JUN SEP DEC for Volumes 1-10
and FEB APR JUN AUG OCT DEC for Volumes 11-31, are not legible.)

1917       1        $1.75        ...
1918       2        $1.75        $5.00
1919       3        $1.75        $5.00
1920       4        $1.75        $6.00
1921       5        $1.75        $5.25
1922       6        $1.75        $5.25
1923       7        $1.75        $5.25
1924       8        $1.75        $6.00
1925       9        $1.75        $6.00
1926      10        $1.75        ...
1927      11        $1.25        $5.00
1928      12        $1.25        $6.00
1929      13        $1.25        $6.00
1930      14        $1.25        $6.00
1931      15        $1.25        $5.00
1932      16        $1.25        $6.00
1933      17        $1.25        $6.00
1934      18        $1.25        $6.00
1935      19        $1.25        $6.00
1936      20        $1.25        $6.00
1937      21        $1.25        $6.00
1938      22        $1.25        $6.00
1939      23        $1.25        $6.00
1940      24        $1.25        $5.00
1941      25        $1.25        $6.00
1942      26        $1.25        $6.00
1943      27        $1.25        $6.00
1944      28        $1.25        $6.00
1945      29        $1.25        $6.00
1946      30        $1.25        $6.00
1947      31        $1.25        ...
By subscription, $6.00 (price per number, $1.25)

List price, Volumes 1 through 31     $177.75
30% Discount                           53.33

Net price, Volumes 1 through 31      $124.42


Information about the Journal of Applied Psychology: only 11 numbers are out of
print in the thirty volumes. These are ...

Volumes 1 through 10 had four numbers per year; after that there are six numbers
per year. Number 3 of Volume 24 will be sold only with an order for several volumes.

Information about prices: the journal is uniformly priced at $6.00 per volume. When
four numbers per volume were issued, the price per number is $1.75; when six numbers
per volume were issued, the price per number is $1.25. For foreign postage, $.25 per
volume should be added. The American Psychological Association gives the following
discounts on orders:

10% on orders of $ 50.00 and over
20% on orders of $100.00 and over
30% on orders of $150.00 and over
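
The discount schedule can be checked against the printed totals. The sketch below is illustrative only: the function names are hypothetical, and rounding the discount to the nearest cent (half up) is an assumption, chosen because it reproduces the printed $53.33 discount on the $177.75 list price.

```python
from decimal import Decimal, ROUND_HALF_UP

# Discount tiers from the price list: (minimum order, rate), highest first.
TIERS = [
    (Decimal("150.00"), Decimal("0.30")),
    (Decimal("100.00"), Decimal("0.20")),
    (Decimal("50.00"), Decimal("0.10")),
]
CENT = Decimal("0.01")


def discount_rate(order_total):
    """Return the rate of the highest tier the order qualifies for."""
    for minimum, rate in TIERS:
        if order_total >= minimum:
            return rate
    return Decimal("0")


def net_price(order_total):
    """Return (discount, net price) for an order, rounded to the cent."""
    order_total = Decimal(order_total)
    discount = (order_total * discount_rate(order_total)).quantize(
        CENT, rounding=ROUND_HALF_UP
    )
    return discount, order_total - discount


discount, net = net_price("177.75")
print(discount, net)  # 53.33 124.42
```

On these assumptions the net price for Volumes 1 through 31 comes to $124.42.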


Current subscriptions and orders for back numbers should be addressed to: 


AMERICAN PSYCHOLOGICAL ASSOCIATION, INC. 
1515 Massachusetts Avenue, N. W. 
Washington 5, D. C. 


Journal of Applied Psychology


Satisfaction With Nursing 


Helen Nahm 
Duke University 


There is growing recognition of the close relationship which exists
between the extent of satisfaction within a professional group and the
quality of service which that group renders to society. Within recent
years nursing service administrators, hospital administrators, doctors,
and others have become increasingly concerned about the apparent
dissatisfaction of many graduate nurses. Faculty members in schools of
nursing have frequently noted that students, who at the time of admission
seem highly motivated and very enthusiastic about their chosen profession,
undergo a gradual change, and, by the time of graduation, are too
frequently a dissatisfied and, at times, a disillusioned group of young
women. Faculty members have believed that, if it were possible to create
an environment in a school of nursing in which the enthusiasm and high
motivation of students might be preserved, the dissatisfaction after
graduation might be materially reduced.


Studies on Satisfaction With Nursing 


are much better satisfied with nursing than those in others. 

From reactions of students to a number of questionnaire items designed
to discover factors associated with satisfaction or dissatisfaction,
it seemed evident that satisfaction was associated with a liking for
bedside care of patients, and a feeling that the nursing school program had
been well-planned and that students had had adequate experience in the
major nursing services. Other factors which were associated with
satisfaction were the kinds of relationships which were established with
faculty members, head nurses, doctors, and others; provisions which were
made for the welfare of students; opportunities to help plan work, to use
initiative, and to express ideas on hospital divisions; and opportunities
to advance and to earn adequate salaries after graduation. Dissatisfied
students were more likely to complain of chronic fatigue, and to feel that
classes were dull and boring.

When an attempt was made to relate reactions to various question- 
naire items to mean scores on the Nursing Satisfaction Scale it seemed 
evident that, in some of the schools in which many unsatisfactory con- 
ditions were present, students were, strangely enough, highly satisfied. 
In other schools in which many provisions were made for the welfare and 
happiness of students, they were not so well satisfied. Comparisons 
which were made between high and low satisfaction groups in three of 
the schools of nursing also indicated that the factors which are associated 
with satisfaction probably vary from one situation to another. For
example, in two of the schools satisfied groups had more satisfactory
relationships with head nurses and supervisors, and were better satisfied with
provisions which were made for student welfare. In the third school 
these factors did not distinguish between high and low groups. In one 
school the dissatisfied group was of significantly higher intellectual ability 
than the satisfied group. In the other two schools the ability level of 
high and low groups was about the same. It seems probable, from these 
findings, that, in each school of nursing, there is a complex set of interre- 
lationships between conditions which are actually present in the environ- 
ment and willingness of students to accept and adjust to such conditions. 


General Plan of Present Study 


During the Spring of 1947 a study was made of the satisfaction of
three groups¹ of students who were enrolled in the Duke University School
of Nursing. The groups included 52 students who were enrolled in the
freshman class, 62 students who were enrolled in the junior class, and 70
students who were enrolled in the senior class. The primary purpose of
the study was to determine whether there were differences among the
groups of students both in the extent of satisfaction with nursing and in
the factors which are associated with it.

¹ It is customary in schools of nursing which offer three-year programs to
designate first-year students as freshmen, second-year students as juniors,
and third-year students as seniors.




To measure satisfaction with nursing an adaptation of the Hoppock 
(4) Job Satisfaction Scale was used. This Scale is made up of four sec- 
tions, with seven possible responses to each section. Responses range 
from those indicating a high degree of satisfaction with nursing to those 
indicating a high degree of dissatisfaction. Each section is scored from 
1 to 7 and total scores may, therefore, range from 4 to 28. 
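
The scoring rule just described can be written out as a short sketch. The function name is hypothetical, and treating each section score as a whole number is an assumption; the paper states only that each of the four sections is scored from 1 to 7.

```python
def total_satisfaction_score(section_scores):
    """Sum four section scores, each a whole number from 1 to 7."""
    if len(section_scores) != 4:
        raise ValueError("expected exactly four section scores")
    for score in section_scores:
        if not isinstance(score, int) or not 1 <= score <= 7:
            raise ValueError(f"section score outside 1-7: {score!r}")
    return sum(section_scores)


# The lowest and highest possible totals match the stated range of 4 to 28.
print(total_satisfaction_score([1, 1, 1, 1]))  # 4
print(total_satisfaction_score([7, 7, 7, 7]))  # 28
```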

To determine factors associated with satisfaction and dissatisfaction 
students were asked to respond to 87 items designed to measure reactions 
to working and living conditions; relationships which were established 
with faculty members, head nurses, doctors, and patients; organized class 
work and clinical experience; and general provisions which were made for 
the welfare of students. Freshman students were asked to list the three 
things which they liked best about the Duke University School of Nursing. 
All students were asked to list the problems or difficulties which had been 
of most concern to them since entering nursing training and to give
suggestions for improvement.

In presenting findings of this study a few explanations seem in order. 
The study on the Duke University School of Nursing students was made 
during the Spring of 1947, a period in which the stresses and tensions in 
hospitals were probably as great as or even greater than at any time dur- 
ing the period of World War II. Furthermore, in the year preceding the 
time the study was made, a number of individuals in key faculty posi- 
tions had resigned, and had been replaced by individuals who were un- 
familiar to junior and senior students in the school. Attitudes of these 
students, therefore, probably reflect both the stresses and tensions of a 
busy post-war period and also the insecurity which students are likely to 
feel when widespread faculty changes are made.

Freshman students entered the school of nursing at about the time 
that many of the new faculty members assumed their respective positions, 
and, therefore, accepted these new people without question. At the time 
the study was made the freshman students had been in the school of nursing
for nine months. Though they had had heavy class work during this
period, they had not been required to carry heavy responsibilities on
hospital divisions. Furthermore, the experiences which they had had in
the actual care of patients were carefully selected by their instructors and
well supervised.


General Findings of the Study 


Satisfaction scores of the three groups of students of the Duke University
School of Nursing and of the group of 428 students from 12
schools of nursing in Minnesota (6) are given in Table 1. It seems evident,
from this table, that the freshman students of the Duke University
School of Nursing were much better satisfied with nursing than either the
junior or senior students in the same school, or the senior students in the
Minnesota schools.


Table 1
Satisfaction With Nursing of Students in Schools of Nursing

                             Students from 12      Students of the Duke University
                             Schools of Nursing    School of Nursing
                             Per Cent of           Per Cent of Total Group
                             Total Group           Seniors      Juniors
                                                   N = 70       N = 62
Enthusiastic     (24-28)          24                 20           18
Likes it         (20-23)          61.6               57           62.5
Indifferent      (16-19)          13                 19           18
Doesn't like it  (12-15)           1.4                3            1.5
Dislikes it      ( 8-11)           0                  1            0
                                 100.0              100.0        100.0
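
As a check on Table 1, the score bands and the column totals can be reproduced in a few lines. This is a sketch only: the names are hypothetical, the "Doesn't like it" entry for the 12 Minnesota schools is read as 1.4 (the reading under which that column totals 100.0), and scores below 8 are given no label because the table lists no band for them.

```python
# Score bands as labelled in Table 1: (label, lowest score, highest score).
BANDS = [
    ("Enthusiastic", 24, 28),
    ("Likes it", 20, 23),
    ("Indifferent", 16, 19),
    ("Doesn't like it", 12, 15),
    ("Dislikes it", 8, 11),
]


def band(score):
    """Return the Table 1 label for a total score, or None if unlisted."""
    for label, low, high in BANDS:
        if low <= score <= high:
            return label
    return None


# Per cent of each group in each band, top row to bottom row of Table 1.
PERCENTS = {
    "12 Minnesota schools": [24, 61.6, 13, 1.4, 0],
    "Duke seniors": [20, 57, 19, 3, 1],
    "Duke juniors": [18, 62.5, 18, 1.5, 0],
}

for group, column in PERCENTS.items():
    print(group, round(sum(column), 1))
```

Each column sums to 100.0 under this reading.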


Mean scores and standard deviations of the various groups on the
Nursing Satisfaction Scale are given in Table 2. The difference between
the means of Duke junior and senior students is not significant (t = ...).
The difference between the mean of the Duke freshman students and that
of the combined² junior and senior group (t = 7.91) indicates that the
freshman students were much better satisfied with nursing than were the
more advanced groups. Differences between means indicate that the
group of 428 senior students from 12 schools of nursing in Minnesota were
better satisfied than the combined junior and senior group of the Duke
University School of Nursing (t = 3.64), but were not so well satisfied
as the Duke freshman group (t = 5.57).


Table 2
Mean Scores and Standard Deviations of Students on the Nursing Satisfaction Scale

                                   Number of
                                   Students      Mean      S.D.
Students from 12 schools of
  nursing in Minnesota                428        21.8      ...
Students of the Duke University
  School of Nursing
    Seniors                            70        21.0      ...
    Juniors                            62        21.6      ...
    Freshmen                           52        23.2      ...


² The mean of the combined junior and senior group is 21.3 and the standard
deviation is 2.68.
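
The combined mean quoted in the footnote follows from the group sizes and means in Table 2 as a weighted average. The sketch below assumes nothing beyond those four figures; the function name is hypothetical.

```python
def combined_mean(groups):
    """Weighted mean of several groups given as (size, mean) pairs."""
    total_n = sum(n for n, _ in groups)
    return sum(n * mean for n, mean in groups) / total_n


# Duke seniors (N = 70, mean 21.0) and juniors (N = 62, mean 21.6).
print(round(combined_mean([(70, 21.0), (62, 21.6)]), 1))  # 21.3
```

The standard deviations needed to recompute the t values are not legible in Table 2, so no attempt is made to reproduce them here.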


Factors Associated With Satisfaction in Nursing 


Almost all students of the Duke University School of Nursing (93 to 
100 per cent) said that they enjoyed bedside care of patients. About 
50 per cent stated that they had opportunities to use their initiative on 
hospital divisions, but only about 20 per cent to help plan work or to ex- 
press their ideas. About 75 per cent felt that ratings on ward work were 
usually fair. Ninety per cent liked the majority of the head nurses, 
teachers, and supervisors, and 75 per cent said that head nurses and 
supervisors usually approved of their work. Eighty to 90 per cent were 
satisfied with the food service and living quarters. About 75 per cent 
felt that they had had adequate care during major illnesses and 60 per 
cent during minor illnesses. 

Percentage differences which are significant at the 1 per cent level 
indicated that junior students, in comparison with seniors, were more 
likely to feel that patients received inadequate care; that doctors de- 
manded too much of nurses; that work on hospital divisions was too 
heavy, and that there was always so much to do that they couldn't do 
satisfactory work; and that needs of students were subordinated to needs
of the hospital. A significantly higher proportion of junior than senior 
students complained of chronic fatigue and backache. 

When comparisons were made between the freshman students of the 
Duke University School of Nursing and the combined junior and senior 
group, percentage differences which are significant at the 1 per cent level 
indicate that the freshman students were: 


less likely to feel that patients received inadequate care;
more interested in helping patients solve their mental and emotional
  problems;
more inclined to feel that patients liked them and the care which they gave;
less inclined to feel that work on hospital divisions was too heavy;
less likely to feel that needs of students were subordinated to needs of the
  hospital;
less inclined to feel that favoritism was shown toward some students;
more likely to say that head nurses and supervisors were good about
  helping them with things they did not understand;
more likely to enjoy life in a nurses’ residence;
more inclined to feel that they had reasonable freedom to do as they liked;
less likely to say that students had little opportunity to participate in
  making and enforcing rules under which they were governed;
more likely to say that they frequently had contact with people outside the
  hospital and nurses’ residence;
less inclined to feel that many of their classes were dull and boring;
more likely to feel that they had had adequate instruction in personal and
  mental hygiene;
more likely to feel that they had opportunities to practice good mental
  hygiene on hospital divisions and in the nurses' residence;
more inclined to feel that they had adequate time for rest and sleep;
less likely to complain of chronic fatigue and irritability;
more likely to feel that the social program was adequate; and
more inclined to feel that the entire nursing school program had been
  well planned.


Problems of Students 


Problems of Junior and Senior Students. The problem listed more
frequently by junior and senior students than any other was lack of
sufficient time to give adequate care to patients. This was attributed to
heavy assignments and understaffed wards. The students also
complained of lack of sufficient time for study, recreation, and sleep. They
disliked having to attend classes during the time they were on night duty.
Junior students were more concerned than seniors about class work.

A few direct comments of junior and senior students which seem
to illustrate their major problems are listed as follows:

Because of my grades and attitudes I was constantly afraid of being
dismissed, particularly during the first year.


I don't like being criticized about my deficiencies without being allowed to
make any explanation, or being given any assistance in overcoming those
deficiencies.


Faculty members have accused me of not being interested in my work.
That's rather silly, I think, because I am, else why should I be here?


I sometimes wonder whether I am in the right profession. At times I am
discouraged and ready to quit.


How can we learn to tolerate and understand those with less experience
when those with more experience seldom try to understand us.


We are given adult responsibilities on hospital divisions, yet placed on a
child's basis in the nurses' residence.


Head nurses expect us always to seem busy. When you stop to talk to
a patient they think you are loafing. Sometimes patients need that more
than any actual nursing care you can give.


One of the supervisors told a student she was just here to work and ...
and should not expect anything else. At this point that is the way ...
We are just here to serve the needs of the hospital and nothing else.


I too frequently feel insecurity rising up inside me.

I used to be healthy. Now I am just plain tired and badly in need of a
vacation.

Problems of Freshman Students. Freshman students listed ...
problems in connection with work on hospital divisions. About ... per
cent complained of lack of time for study and lack of knowledge of how to
study. About 35 per cent felt that they lacked social skills which are
necessary to feel at ease in social situations. Thirteen per cent were
sensitive about their personal appearance, and an equal number felt that
they lacked personal qualities which are needed for success in ...


Suggestions for Improvement 


Suggestions of Junior and Senior Students. Typical suggestions made 
by junior and senior students for improvements in the nursing school pro- 
gram are stated as follows: 

We need shorter and better planned hours, particularly during the second 

year when students have heavy class work. 


More nurses are needed so that patients can be given adequate care. Make 
it possible for students to spend more time with patients. 


More competent head nurses and supervisors are needed; individuals who 
are interested in students and their problems, and who know how to super- 
vise work on hospital divisions. 


Plan for more teaching on the wards where it will really soak in. Give us 
more information about patients, and make it possible for us to attend 
conferences with doctors. 


Improve corrective methods when we make mistakes. If criticisms are
to be made, talk to the student about them before turning in her record to 
the nursing school office. 


Make courses more complete and interesting. Have more panel discussions 
and better organized, more interesting lectures. 


Place more stress on mental hygiene and good public relations. 
Have more lectures on current topics of the day. 


We need an adviser to help us with our social, personal, and emotional
problems; someone who knows us individually, and who can help us
understand our weaknesses.


We need more freedom and liberty to govern ourselves. Treat us as 

adults, not as children. 

Suggestions of Freshman Students. Most of the suggestions for
improvement which were given by freshman students had to do with
organized class work. They felt that some of their courses should be better
organized, and that some of their teachers should be better prepared for
teaching. Of 41 separate comments which were made about experiences
on hospital divisions, 39 were favorable. The students felt that the
experience on hospital divisions was very interesting, very beneficial to
them; that it gave them a feeling of personal worthwhileness, as well as a
very great deal of satisfaction.


Attitudes of Freshman Students Toward the School of Nursing Program 


In response to the question “What do you like best about the Duke
University School of Nursing?”, 40 per cent of the students mentioned the
interested and understanding instructors, supervisors, and head nurses;
... per cent, the satisfaction of working on hospital divisions; 32 per cent,
the friendly atmosphere; 30 per cent, the attractive residence; and 20 per
cent, the well organized and well-taught courses. From 10 to 20 per cent
mentioned the broad experience which students have on hospital divisions,
the opportunity of working with experienced and highly qualified doctors,
the democratic student organization, and the access to college life.


Summary and Conclusions 


Findings of this study suggest that there is a sharp decrease in
satisfaction with nursing as students progress from the freshman to the junior
year. At the end of a period of nine months in the school of nursing, the
freshman students were still a highly motivated and enthusiastic group.
Junior and senior students, on the other hand, showed many evidences of
tension and frustration. Comparisons between the two latter groups
indicated that junior students have more problems in connection with
class work and with work on hospital divisions than do seniors. The
many favorable comments which the freshman students made about the
entire school of nursing program seem in marked contrast to the rather
unfavorable comments made by the more advanced groups. It seems
probable, from this finding, that the unfavorable comments are, in part,
a reaction against the ever-increasing responsibilities which students in
nursing are expected to assume as they progress through the school of
nursing.

Suggestions which the students made for improvement in the school
of nursing program seem, on the whole, to be thoughtful and reasonable.
These suggestions indicate that, if more frequent attempts were made to
determine how students feel and what they believe, the result would be of
value to both the individual student and to the school of nursing.

As stated in a foregoing section of this paper, a number of faculty
changes had been made at the Duke University School of Nursing in the
year preceding the time the study was made. Attitudes and comments of
junior and senior students, in comparison with those of freshmen, give
some indication of the effect upon students of widespread faculty changes.

In presenting findings of this study it is recognized that a follow-up
study on one group of students as that group progresses through the
school of nursing would be of more value than comparisons among three
separate groups. Plans have been made to do a follow-up study on the
freshman group. Findings of this project suggest, however, that the
source of the dissatisfaction and disillusionment of the graduate nurse may
be traced, at least to a very great extent, to the experiences which the
nurse has had as a student in a school of nursing. It seems probable that
the situation in nursing will not improve markedly until it becomes
possible for students to continue to feel about nursing much as the Duke
University School of Nursing freshmen did at the end of a period of nine
months in the school of nursing. To achieve this result it would ...
responsibilities which they carry at the present time, and to make it possible
for them to be students of nursing rather than primarily hospital workers.
However, the problem of staffing hospitals at this period is a crucial one
and can only be solved through the combined efforts of the hospital
administrators, doctors, the schools of nursing themselves, and the public.
There is, at present, considerable interest in hospitals and schools of
nursing, but, as yet, much too little real understanding of the problems
which these institutions face. If studies on the attitudes and satisfactions
of students in nursing can contribute to an understanding of these problems,
perhaps the support which is needed to solve them will follow in ...

Utilizing Findings of this Study 


When studies have been made within a school of nursing it is essential
that findings be utilized as soon as possible. Reports of this study
have been made available to a number of individuals in the Duke University
School of Nursing, the Hospital, and the Medical School who are
concerned with the welfare of students in nursing. Comments
of these individuals have indicated that they are interested in the findings
of the study, and anxious to make changes which are needed to help
students achieve greater satisfaction in nursing. The students who
participated in the study have been interested in the findings. It is
believed that the opportunity of expressing opinions and of making
suggestions for improvement, in itself, has had a therapeutic effect. In
addition, students have been stimulated to think seriously about their
... attitudes and personal characteristics, and to realize that they too
have a vital stake in the future welfare of the nursing profession.


Received January 6, 1948.


References 


1. American Journal of Nursing. The general staff nurse, a study of the general
   staff nurse in eighteen states. Amer. J. Nurs., 1938, 38, 1221-1227.

2. Bureau of Labor Statistics, U. S. Dept. of Labor. The economic status of the
   nursing profession. Amer. J. Nurs., 1947, 47, 456-462.

3. Committee on the Grading of Nursing Schools. Nurses, patients, and pocketbooks.
   New York: National League of Nursing Education, 1928.

4. Hoppock, Robert. Job satisfaction. New York: Harpers, 1935.

5. ... Factors associated with job satisfaction in nursing. Amer. J. Nurs.,
   1940, 40, 1389-1392.


How Better Personnel Selection Can Reduce Factory Costs *


William James Giese 
William James Giese, Ph.D., and Associates, Chicago, Illinois 


A. The primary purpose of this analysis is to determine whether
psychological aids for the selection and placement of personnel will save
more money than the installation and maintenance of such aids will cost.
B. If such aids can effect significant savings, the secondary purpose
is to outline in general terms the scope of a psychological testing program
which will meet the needs of the "A" Company.
C. A further supplementary purpose is to make any observations or
comments which may be valuable that are indicated by the study even
though they do not bear directly upon A or B above.


Method 


A. All of the jobs in the non-exempt category were studied by the
psychologist for the purpose of grouping the jobs into general skill
families (i.e. groups of jobs in which the capacities required to learn the
work are generally similar).


1. This step was accomplished through a study of the job descriptions.

2. All of the jobs in the shop job families were checked with the Time Study
and Routing Supervisor.

3. All the jobs in the office job families were checked with the Personnel
Director and the Assistant to the Controller.

4. Some minor changes were made in a few of the groupings in the
job families as a result of these checks.

5. With some of the shop jobs the work operations were observed ...


B. ...


1. Only those employees who were on piece work 70% or more of the time
for pay periods 1-7 and 8-14 (1947) were included.


* This is the body of a longer report entitled “A Psychological Analysis of Personnel
Requirements from the Standpoint of the Practicability of Psychological ...
Selection and Placement of Personnel," submitted to the major executives of Company "A."
For the purposes of publication, the number of exhibits had to be reduced and the
detailed appendices eliminated.




C. General burden costs and retainer charges, considered in connection
with the spread of productivity among employees and break-even
points for the selective aids, were computed for various increases in better
selection and placement of employees for different selection ratios.


1. A selection ratio is the number of persons hired to the number con- 
sidered. 
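
The definition of the selection ratio can be stated as a one-line computation. The sketch below uses a hypothetical function name and hypothetical figures (10 hired out of 40 considered); neither comes from the report.

```python
def selection_ratio(hired, considered):
    """Number of persons hired divided by the number considered."""
    if considered <= 0:
        raise ValueError("at least one person must be considered")
    if not 0 <= hired <= considered:
        raise ValueError("hired must lie between 0 and considered")
    return hired / considered


# Hypothetical example: 10 persons hired out of 40 considered.
print(selection_ratio(10, 40))  # 0.25
```

A smaller ratio means the employer is choosing from a larger pool relative to the number of openings.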


D. A study of the terminations for the most recent twelve month 
period was made for each skill family. 


1. These percentages were used in relationship with other data in
determining the practicability of psychological selection and placement aids.


E. Such factors as merit ratings, spoilage, reworks, accidents, etc.
were not used in this study because:


1. Such data were not readily available. 
2. There were sufficient data for the purposes of this study without them. 


Results 


A. As a result of studying the job descriptions and in some cases ob- 
serving the work being performed, all of the non-exempt jobs were classi- 
fied into 18 general skill groups on the basis of the capacities necessary 
to learn the work. 


1. There are 7 skill groups for the shop jobs.
2. There are 9 skill groups for the office jobs.
3. For both the office and shop there is a miscellaneous classification for
those jobs which did require certain capacities, but because the number
of employees doing the work was small, it was impractical to set up a
separate skill family.
4. For both the office and shop there is an unskilled classification for those
jobs which require very little capacity to learn.
5. Table 1 lists the name of the skill family, number of persons
in each family, and the number of terminations in each family.
a. Appendix I² lists for each of the shop skill families:
i. The selection requirements.
ii. The job titles included in the skill family.
iii. The number of persons in each job as of May 1, 1947.
b. Appendix II lists the same information for the office jobs.
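
The percentage column of Table 1 can be recomputed from the other two columns as terminations divided by persons. In the sketch below the function name is hypothetical and rounding to the nearest whole per cent is an assumption about how the table was prepared; the four rows are the first four shop skill families in the table.

```python
def termination_percent(persons, terminations):
    """Terminations in the period as a whole-number per cent of persons."""
    if persons <= 0:
        raise ValueError("persons must be positive")
    return round(100 * terminations / persons)


# (persons, terminations) for four skill families, copied from Table 1.
rows = {"MC": (114, 24), "M1": (152, 38), "M2": (363, 124), "M3": (213, 334)}

for code, (persons, terminations) in rows.items():
    print(code, termination_percent(persons, terminations))
```

This gives 21, 25, 34, and 157 per cent, matching the printed figures for those rows. A figure above 100 per cent means that more people left during the 52 weeks than were on the job in that family.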


B. There were sufficient data available to analyze the productiveness 


of four of the 18 skill families. 
C. The percentage of anticipated 


_ ‘The difference between the minimum guaranteed base 
plece rate, 


earned rate is a reliable measure 
rate and amount earned on 


* The original ге i tails in Appendices I and II. 
port contained these details in Appen 
* The anticipated earned Tate is the amount of money an employee who has com- 
to the learner stage should earn on standard piece work. This is roughly equivalent 
about 70-80% of leveled standard output. 


346 William James Giese 


Table 1

Terminations by Skill Families for the Period of May 1946 to
May 1947 (52 week period)

                                                     Number    Number of   Terminations
                                                       of      Termina-    as Per Cent
Code      Skill Family                               Persons     tions       of Total
MC        Craftsmen mechanics                          114         24          21%
M1        High mechanical                              152         38          25%
M2        Average mechanical                           363        124          34%
M3        Low average mechanical                       213        334         157%
            Molder power squeezer                       44        150         341%
            Low average mechanical (excl. MPSq.)       169        184         109%
MDL       Light to medium manual dexterity             176        261         148%
            Inspectors castings E and F                 52        138         265%
            MDL (excluding insp. cast. E and F)        124        123          99%
Crane     Overhead and crawler crane operators          21         11          52%
SC        Shop clerical                                149        113          76%
O         Unskilled (shop and office)                  335        434         127%
OC        General clerical                             156         88          56%
OC-Comp   Clerical-computational                        61         28          46%
OC-T      Clerical-typing                               10          8          80%
OComp     Computational                                 29         11          38%
OComp-T   Computational-typing                          12          6          50%
OSec      Secretarial                                   37         29          78%
OA        Administrative                                46         11          24%
OET&D     Engineering technicians and draftsmen         61         12          20%
ST        Special trainees                              79         19          24%
Misc      Miscellaneous (shop 36; office 55)            91         57          63%


upon which to base other computations such as costs, validation of various 
selection techniques, etc.


1. The correlations and other data appear in Table 2. 


D. There is a wide variation in the productivity of people in each skill
family. This is shown by the sigma figures in Table 2; this fact is graphi-
cally illustrated in Figure 1, which presents the average percentage of antic-
ipated earned rate of each individual studied for the first 14 pay periods.


1. The highest producer turns out from 2.1 to 3.7 times more work than
the lowest producer.


2. The upper 32% of producers turn out from 1.4 to 1} times more work
than the lower 32% of producers.
3. There are some small differences between the four groups in the
total spread of productivity, but they are so small that they are not significant.


E. It is interesting to note (Table 2 and shown graphically in Figure
1) that mean productiveness varies from 107% to 127% when the produc-


E Better Personnel Selection 347 


Table 2
Reliability and Related Facts Concerning the Percentage of Anticipated Earned
Rate as Meaningful Data for the Purposes of this Study


                 Skill Family
   M1        M2        M3        MDL
   63        125       81        65
   .985      .92       .88       .88
   .99       .96       .94       .94


1425 — 192424 107224 107427 
20 ± 1.8    24 ± 1.5    22 ± 1.75    22 ± 1.95
' 2 46 18 18 

N = the number of persons in the study.
r = Pearson product-moment coefficient of correlation for pay periods 1-7 vs. 8-14.
R = estimated correlation for pay periods 1-14 vs. 15-28.
M = the arithmetical mean.
σ = a measure of variability: the range of the middle 68%.
S.E. = the plus and minus to be kept in mind in regard to one individual's percentage
of anticipated earned rate.

tiveness of individuals is grouped according to the skill family in which
they belong.

1. Most of these differences are statistically significant.

2. This difference in mean productivity for the different skill families may
indicate some general differences in “looseness” or “tightness” of stand-
ards as established by the time study department. Training in leveling
and observation for observer-analysts is indicated.
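
The two correlation rows of Table 2 are consistent with a Spearman-Brown step-up from a 7-period comparison to a 14-period comparison. The report does not name the formula, so the sketch below is an assumption; the function name `spearman_brown` and the dictionary are illustrative, and the input values are the half-length correlations as read from Table 2.

```python
def spearman_brown(r_half: float, factor: float = 2.0) -> float:
    """Estimate the reliability of a measure lengthened by `factor`.

    Assumption: the report's "estimated correlation for pay periods
    1-14 vs. 15-28" was obtained this way; the source does not say so.
    """
    return factor * r_half / (1 + (factor - 1) * r_half)


# Correlations for the shorter comparison, as read from Table 2.
half_length_r = {"M1": 0.985, "M2": 0.92, "M3": 0.88, "MDL": 0.88}

estimated = {family: round(spearman_brown(r), 2)
             for family, r in half_length_r.items()}

for family, value in estimated.items():
    print(family, value)
```

Under this assumption the stepped-up values round to the figures printed in the third row of the table.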

F. Figure 2 graphically illustrates the differences in productivity of
individuals within departments and compares the differences of produc-
tivity between departments as a whole.
... about as great as it was.

... the variability of productiveness ... when studied by skill groups ...

Departments 3, 4, and 17 present ... less spread of productive-
ness of employees, probably due to the type of ...
i. The Punch Press Department is certainly a case of ...

work.
... variability, the ratio of high to ...
a. These differences are significantly greater than those between the skill ...
... in department 4 have a higher productivity than most ...
... might be examined by the time study ...
... not only sow the seed for employees' dissatisfac-
tions but also negate the results of job evaluation.


[Figure 1. Panels: 1, Those jobs which require a high degree of mechanical capacity; 3, Those jobs which require low average mechanical capacity; 4, Those jobs which require high average discriminative dexterity (variable pattern). Horizontal axis of each panel: percentage of anticipated earned rate.]

Fig. 1. A comparison of variabilities of productivity of persons by skill families.


[Figure 2. Panels: Department 3 (Machine Shop Plant 1); Department 4 (Plant 3 Shop); Department 5 (Machine Shop Plant 5); Department 17 (Punch Press). Horizontal axis of each panel: percentage of anticipated earned rate.]

Fig. 2. A comparison of variabilities of productivity of persons by departments.


G. Table 1 presents a summary of a study of terminations by skill
groups.


1. In general, the higher termination rates are found in the lower
skilled groups.
a. Working conditions, wages, and related causes account for most of
the terminations in these groups.
b. In the higher skill groups in the office, the reasons for termination
tend to be personal, i.e., marriage, husband moves, household,
children, etc.
c. In the lower skill groups in the office jobs, the reasons for termination
are working conditions, wages, and related causes.
d. The data for statements a, b, and c are not presented in table form
but they are available among the work sheets.
2. With the present termination rates, any small improvement in selection
and placement of personnel with even a large selection ratio ...
relationship between the selective techniques and the employees' ...
would produce results which should be noticed in from 3 to 6 months.
3. Those areas with very high turnover should be separately studied with
the aim of reducing those factors which must be intrinsic in the work as
it is now set up.
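
The percentage column of Table 1 is simply terminations divided by persons in the family. A minimal sketch of that arithmetic, using five rows copied from the table (the names `table_1` and `termination_rate` are illustrative, not from the report):

```python
# (persons, terminations) for a few skill families, copied from Table 1.
table_1 = {
    "MC": (114, 24),
    "M1": (152, 38),
    "M2": (363, 124),
    "M3": (213, 334),
    "MDL": (176, 261),
}


def termination_rate(persons: int, terminations: int) -> int:
    """Terminations as a whole-number per cent of persons in the family."""
    return round(100 * terminations / persons)


rates = {code: termination_rate(p, t) for code, (p, t) in table_1.items()}

for code, rate in rates.items():
    print(f"{code}: {rate}%")
```

A rate above 100% means that more people left during the 52 weeks than the family holds at one time, which is the situation in the lower skilled groups.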


H. The relationship between the length of service of the employee
and productiveness is zero.

I. The estimated burden charges against wages paid on piece work are
$2,043,280.

1. The calculations for the above are among the work sheets. 


J. Since burden charges remain relatively constant as the productive-
ness of people on piece work increases, Table 3 shows the savings in


Table 3

Savings in Burden Charges as a Result of Improved Selection and Placement of
Personnel for Different Selection Ratios

(A relationship between the psychological selection aids and productivity is assumed
to be a correlation of .40)

Selection    Per Cent        Burden,        Burden,
Ratio        in Product.     100% Term.     72% Term.
1            11.2            $228,847       $164,770
2             9.2             187,982        135,947
3             7.6             155,280        111,808
4             6.4             130,770         94,154
6             4.2              85,818         61,789
7             3.2              65,385         47,077
8             2.2              44,952         32,365
9             1.3              24,654         17,751

* Based on the fiscal year of 1947, and estimated from the charges during
25 weeks.


Table 4


Reductions of Retainer Payments as a Result of Improved Selection and Placement of
Personnel for Different Selection Ratios
(A relationship between the psychological selection aids and productivity is assumed
to be a correlation of .40)

Selection    Per Cent         Payment,       Payment,
Ratio        in Retainers     100% Term.     72% Term.
1            90               $30,143        $21,703
2            80                26,794         19,292
3            70                23,444         16,880
4            60                20,095         14,468
5            52                17,416         12,540
6            44                14,736         10,610
7            36                12,057          8,081
8            28                 9,378          6,752
9            20                 6,698          4,823


* Based on total payments for 1947 estimated from payments for the first 4 months.


burden charges for different selection ratios assuming a relatively low
relationship between the selection and placement aids and productivity.
1. Such savings are predicated on a continued need for high production of
the entire plant.
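
The savings figures in Table 3 can be checked against the burden estimate in statement I. The sketch below assumes, from the arithmetic alone, that the 100% column is the stated per cent of $2,043,280 and that the 72% column is 0.72 times the 100% column; the report does not spell out either step. Only rows whose printed figures agree with this arithmetic are used, and the variable names are illustrative.

```python
BURDEN = 2_043_280  # estimated burden charges, statement I

# Per cent figures from Table 3 for selection ratios 1, 4, 6, 7, 8.
per_cent = [11.2, 6.4, 4.2, 3.2, 2.2]

# Assumed: savings = per cent of the burden estimate.
savings_100 = [round(BURDEN * p / 100) for p in per_cent]

# Assumed: the second column is 72% of the first.
savings_72 = [round(0.72 * s) for s in savings_100]

for p, full, reduced in zip(per_cent, savings_100, savings_72):
    print(f"{p:5.1f}%  ${full:>9,}  ${reduced:>9,}")
```

On these rows the computed dollar amounts match the table to the dollar.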
K. The reductions possible in retainer payments for various selection
ratios are presented in Table 4.
1. The savings due to reductions in retainer payments do not depend upon
continued high production in the entire plant.
2. Since the retainer payments represent only a small fraction over one
per cent of the general burden charges, savings effected on retainer pay-
ments are over and above the savings made on the general burden
charges.
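
Statement K.2 can be made concrete from Table 4. The sketch assumes that the "Per Cent in Retainers" column is a percentage of one fixed annual retainer total, so that dividing each payment by its percentage recovers that total; this reading is an inference from the figures, not something the report states. Names are illustrative.

```python
BURDEN = 2_043_280  # estimated burden charges, statement I

# (per cent in retainers, payment in dollars), 100% column of Table 4.
rows = [(90, 30143), (80, 26794), (70, 23444), (60, 20095), (52, 17416),
        (44, 14736), (36, 12057), (28, 9378), (20, 6698)]

# Assumed: each payment is the stated per cent of one annual total.
implied_totals = [payment / (pct / 100) for pct, payment in rows]
retainer_total = sum(implied_totals) / len(implied_totals)

# Retainer payments as a per cent of the general burden charges.
share_of_burden = 100 * retainer_total / BURDEN

print(round(retainer_total), round(share_of_burden, 2))
```

Every row implies nearly the same total, a little under $33,500, which is between one and two per cent of the burden estimate.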

Table 5


Estimated Cost of Installing and Maintaining a Sound Program of Psychological
Aids for the Selection and Placement of Personnel
(2000 persons per year assumed volume)


Item                        First Year    Second Year
Consultants' fees             $4000          $1000
Psychometrist's salary         3000           3500
Tests, supplies, etc.          1000            ...
Total                         $8750          $6250


L. A breakdown and summary of the cost of installing and maintain-
ing a sound program of psychological aids for the selection and placement
of personnel are presented in Table 5.


Interpretations and Recommendations 


A. As soon as the labor market will permit a rejection of 1 out of 10
of the applicants for employment, psychological testing and other aids
can result in a demonstratable saving of at least $12,000.


1. These aids will not only be valuable for those employees on piece work 
but also for those employees on day work or salary. 

a. Although the results of selecting these persons with the use of psycho-
logical aids do not lend themselves readily to cost and savings analysis,
the benefits from them are none the less real and sizeable from a
financial standpoint.

b. Improved selection from the standpoint of the capacities of indi-
viduals to learn and do the work for those jobs on day work or salary
can result in increased productivity and proficiency, but it depends
to a large extent on the skill and effectiveness of the foreman or super-
visor to see to it that these capacities are realized.

2. Persons selected with the help of psychological aids not only are capable 
of a higher level of production and proficiency but they also learn their 
jobs more rapidly. 

a. Although no cost data on the savings due to shorter learning or
“breaking in” time are included in this study, such savings, in other
instances, have been found to more than pay for the entire program.

3. An increase in the number of employees placed in work which matches
their capacities better should result in a decrease in turnover.
a. Persons in work which suits their capacities tend to find more satis-
faction in their work.

4. The above (i.e., statement A) is based on the expectation of finding a
validity coefficient of .40 between the psychological aids and produc-
tiveness.

a. It is probable that a higher coefficient will be obtained. 

b. A validity coefficient of .50 (entirely possible) would increase the
savings discussed in this study by 55%.

c. A validity coefficient of .60 (fairly possible) would double the savings
discussed in this study.


B. It is recommended that a sound program of psychological testing 
of applicants for employment be installed. 


1. The present tight labor market will provide an excellent opportunity to
give the psychological testing procedures a test run.
a. At the present termination rate sufficient data can be gathered for
most of the larger skill families within 3 to 9 months.
b. This will make it possible to test and check the procedures thoroughly.
i. Thus, when the labor market eases, tables based on experience with
previously hired employees will allow the most efficient use of the
more favorable selection ratio.


C. It is recommended that a psychological testing program for aiding 
in the selection and placement of new personnel of the following scope 
be installed: 


1. All applicants except those for jobs in the unskilled group be tested as
indicated.
a. Appendices I and II list the capacities and/or proficiencies to be
measured as well as the tentative standards for each skill family.

2. A psychometrist be employed to administer the tests and to do related
work.
a. This would be about a half to two-thirds time job when processing
2000 applicants a year.

3. A system of follow-up must be an integral part of the program.
a. Some measure, probably a good merit rating, is needed for those
employees not mainly on piece work.
b. If at all practicable, such data as spoilage, reworks, accidents, etc.,
should be part of the follow-up against which to measure the effective-
ness of the psychological selection aids.
c. Productivity should be used to measure the effectiveness of the
program.
d. This follow-up is part of the psychometrist’s job.

4. The psychologist will engineer the installation of the program and will
direct the follow-up work.


Received December 16, 1947. 


Job Evaluation Simplified: The Utility of the Occupational 
Characteristics Check List * 


Roger M. Bellows and M. Frances Estep 


Department of Personnel Methods, School of Business Administration, 
Wayne University 


Job evaluation can provide a valuable system of known dependability 
for agreements between workers and management on the delicate matter 
of employee pay rate schedules. That the present hit-or-miss systems 
of job evaluation are of value is suggested by their widespread use. How- 
ever, nearly all who have set up such systems will agree that much devel- 
opment and appraisal of methods is needed. Appraisal of the utility of 
job rating devices is of considerable significance looking toward improve- 
ment of job evaluation. 

Jay Otis conducted an unpublished study in which he found that a 
method for scoring the Occupational Characteristics Check List did not 
yield scores showing any significant relationship to evaluated points ob- 
tained from job evaluation. In the present paper further examination is
made of the usefulness of the Occupational Characteristics Check List
(OCCL)¹ in job evaluation.

The OCCL was developed in 1935 at the Baltimore Center of the 
Occupational Research Program. It is a form for estimating what 
amounts of 47 or more traits or abilities are needed by the worker to do 
the job. A forerunner of the check list was developed by Viteles,²
which he called a job psychograph. The OCCL was used by the Worker 
Analysis Section of the Occupational Research Program in developing job 


* The authors express appreciation to Mr. I. W. Winkelman, to the controller,
personnel director, and the department heads of his organization, and especially to
Miss Eleanor Yunis who assembled the job descriptions and specifications used in
training the committee members, and to the members of the job evaluation committee
for their thoughtfulness in rating the jobs.

¹ The Worker Characteristics Form (now called the Occupational Characteristics
Check List) is shown, and its uses described, in William H. Stead, Carroll L. Shartle
and Associates, Occupational counseling techniques. New York: American Book Com-
pany, 1940, pp. 175-183. The same check list is discussed in Dale Yoder, Personnel
management and industrial relations, New York: Prentice-Hall, Inc., 1942, pp. 103-105
and also in Jay L. Otis and Richard H. Leukart, Job evaluation, New York: Prentice-
Hall, Inc., 1948, pp. 84-88.

² Morris S. Viteles, Industrial psychology. New York: W. W. Norton and Company.


families and in the Job Analysis Section of the Program in connection 
with job analysis of somewhat more than 10,000 occupations. It is a 
common technique in job study, but it probably has not been used very 
much in job evaluation.

The present study describes an experiment in the evaluation of main 
office jobs in a women’s specialty store chain. The members of the job 
evaluation committee were trained to use job descriptions, job specifica- 
tions, and the OCCL to evaluate these jobs in terms of a simple job evalu- 
ation system. 


The Job Analysis and Evaluation 


In a women's specialty store chain organization a program for job
analysis and evaluation of 53 main office jobs was undertaken, for the 
purpose of agreeing upon fair and equitable salary rate ranges. The plan 
was presented to the employees through an employee memorandum set- 
ting forth the objectives of the program. It was also pointed out that a 
recent employee opinion survey had suggested the use of job analysis and 
job evaluation. The memorandum is shown below: 


To ALL EMPLOYEES: 


The company will begin in the near future a program of job analysis and 
evaluation of Main Office jobs, the purpose of which is to insure fair and 
equitable salary rate ranges. Job analysis and evaluation is a device for estab- 
lishing and maintaining rate ranges and is widely used among progressive retail 
firms today. Numerous responses in the Employee Opinion Survey in which 
you participated suggested the use of job analysis and evaluation. 

This program is intended to accomplish the following:

1. Provide a complete record of the content of all jobs and their require-
ments so as to assist the company in recruiting and training new employees for
specific jobs;
2. Provide a clear picture of the lines of progression within each department
and between departments;
3. Set up a relationship between all jobs in the office in terms of job
knowledge, responsibilities, mental and skill requirements, working conditions,
etc., associated with performance of work. Salary ranges set up as the result
of the job evaluation will recognize differences in the above-mentioned require-
ments. However, no individual earnings will be reduced as the result of this study;

4. Provide the framework for a merit rating plan which will assure periodic
and fair appraisal of the individual performances of all employees.

A committee has been established which will carry on the work of job
analysis and evaluation under the guidance of an outside specialist experienced
in this work. The committee will consist of the following members:
1. Permanent employee member (selected by vote of employees)
2. One other employee from the particular department under review
3. Permanent department head member
4. The department head from the department under review
5. Job analyst member
6. Personnel department member
7. Top management member
8. Technical advisory member.




You will note that the committee is representative of the various levels of 
responsibility. 

The job analyst will prepare analyses of all jobs after talking with the 
department head and the employee. The committee will analyze these jobs 
with the participation of the employee members. After the jobs have been 
analyzed and approved by the committee, they will be evaluated and grouped 
into rate ranges by the committee. 

Job analysis and evaluation is not а speed-up or efficiency development 
procedure or a money-saving device. It will be to the mutual benefit of the 
company and its employees to establish a systematic process of maintaining 
salary rate ranges in proper relationship to each other and to the requirements
of the job.

Since each of you will aid in the job analysis phase of the program, your
whole-hearted cooperation is urgently requested.

Signed
Company President
THE COMMITTEE
Committee signatures


The next step in the program was indoctrination of the committee mem- 
bers. The general purposes and specific objectives of job analysis and 
evaluation were discussed. The duties and functions of the committee 
were clarified. In subsequent training sessions the use of the job analysis 
form was illustrated, including the OCCL, for estimating what amounts 
of 47 or more traits or abilities are needed by the worker to do the job. 

The job analyst then prepared a few job analyses which were brought 
before the committee for review, revision, and approval. Job analyses for 
the remainder of the jobs were prepared in accordance with the modified 
procedure growing out of the committee’s discussion. About one meeting 
for every three job analysis schedules was required. The OCCL was 
discussed in detail for each job. Final approval by the committee of the 
content of the job analyses completed this phase of the program. Each 
committee member spent approximately 64 hours during the 33 com- 
mittee meetings in study and discussion of the jobs; each may be said to 
be quite familiar with the descriptions, specifications, and characteristics 
required in the 53 jobs before the job evaluation phase of the program Was 
undertaken. 

The job evaluation phase of the program began with a discussion of 
suggested method and procedure. The committee then discussed job
factors to use for evaluating the jobs. It was agreed to evaluate the jobs
in terms of two factors: responsibility, and training and experience.
(Commonly-used factors such as working conditions and job hazards
were omitted since the jobs under consideration were deemed the same in
such characteristics.)


Definition of the Two Factors


The Factor of Responsibility, as indicated in the job descriptions:
1. Responsibility for employee relations; 2. Responsibility for public rela-
tions; 3. Responsibility for handling money; 4. Responsibility for care of
merchandise, material, and equipment; 5. Responsibility for records
(accuracy); 6. Responsibility for confidential information; 7. Responsi-
bility for number of employees supervised, and type of supervision given;
8. Responsibility for functioning in the absence of supervision; and
9. Responsibility for store controls and store personnel relations.

The Factor of Training and Experience—as indicated in the job specifi- 
cations: 1. Experience required; 2. Minimum training required on the job 
to reach normal production; 3. Technical or vocational training required; 
and 4, Formal education required. 

This small number of factors seemed appropriate in view of the study 
by Lawshe and Wilson³ of job evaluation systems using only three or four
factors. They say, “The final job rank seems to be determined by judg- 
ments on a limited number of factors, regardless of the particular type of 
procedure or the number of point scales through which the raters arrive 
at the final ratings of the job.” Results seem to be as defensible as those 
of the more elaborate systems utilizing a greater number of factors. 

The actual evaluation was done by sorting a pack of cards, with a job 
title on each card, into ranks from high to low for one factor at a time. 
Evaluation was made first for the training and experience factor and 
then for the responsibility factor. The rating judgments were made 
completely independently by each of the five judges or raters who were 
members of the committee. There was no discussion of the jobs after 
the evaluation was begun. However, each of the raters had available to 
them the job descriptions, specifications, and OCCL’s for reference 
throughout the evaluation.

After the ratings had been completed, the total number of points for
each job was determined by adding the rank points for both factors of
all five raters (i.e., 2 factors × 5 raters × 1 to 53 rank points). The pos-
sible spread of total rank points used for the 53 jobs could range from a low
of 10 points to a high of 530 points. The actual spread was 18 to 523
points.
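
The limits of 10 and 530 follow directly from the design: each job receives one rank from 1 to 53 on each of two factors from each of five raters. A short sketch of the totalling step (the function `total_points` and the sample ranks are hypothetical, used only to show the bookkeeping):

```python
N_JOBS, N_FACTORS, N_RATERS = 53, 2, 5

# Lowest and highest totals a single job could receive.
min_total = N_FACTORS * N_RATERS * 1
max_total = N_FACTORS * N_RATERS * N_JOBS


def total_points(ranks_by_factor: dict[str, list[int]]) -> int:
    """Sum one job's rank points over both factors and all raters."""
    for ranks in ranks_by_factor.values():
        if len(ranks) != N_RATERS or not all(1 <= r <= N_JOBS for r in ranks):
            raise ValueError("expected one rank from 1 to 53 per rater")
    return sum(sum(ranks) for ranks in ranks_by_factor.values())


# Hypothetical ranks given to one job by the five raters.
example_job = {
    "training and experience": [12, 15, 11, 14, 13],
    "responsibility": [10, 16, 12, 13, 14],
}

print(min_total, max_total, total_points(example_job))
```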

The data were examined for reliability (Spearman-Brown, split half;
N = 53). The estimated reliability index for the training and experience
factor was .97; for the responsibility factor, .95. When these factors
were combined, in terms of unit weights, the estimated reliability cor-
relation was .97. The correlation between the average rank on the two
factors was .96 ± .01 (S.E.). This might suggest that only one factor
would be needed since virtually all of the variance contained in the
training and experience factor is also present in the responsibility factor.
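
The standard errors quoted in this paper agree with the classical large-sample formula for the standard error of a correlation, (1 - r²)/√N. The authors do not say which formula they used, so treating it as this one is an assumption; the function names below are illustrative.

```python
from math import sqrt


def se_of_r(r: float, n: int) -> float:
    """Classical large-sample standard error of a Pearson r (assumed formula)."""
    return (1 - r ** 2) / sqrt(n)


def shared_variance(r: float) -> float:
    """Proportion of variance two measures have in common."""
    return r ** 2


N = 53
for r in (0.96, 0.74, 0.79):
    print(r, round(se_of_r(r, N), 2))

print(round(shared_variance(0.96), 2))
```

With r = .96 the two factors share roughly nine tenths of their variance, which is the basis for the remark that one factor might suffice.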


³ C. H. Lawshe, Jr., and R. F. Wilson. Studies in job evaluation. 6. The reliability
of two point rating systems. J. appl. Psychol., 1947, 31, 355-365.


Occupational Characteristics Check List Data 


The OCCL was scored in a simpler manner. Each trait or ability re- 
quired of workers had been rated by the job analyst and discussed, modi- 
fied, and agreed upon by the committee as: 


A. Very high degree of the characteristic required in some element of 
the job. 

B. Above average amount of the characteristic required, either in num- 
erous elements of the job or in the major or most skilled element. 

C. Medium to very low degree of the characteristic required in some 
element or elements of the job. 


For “A” amount of the trait, a score of 3 was given; for “B”, a score of
2; and for “C”, a score of 1. The summary of points obtained in this
manner was the occupational characteristics score used in the computa- 
tions reported below. The range of scores was from 10 to 62. The 
raters had no knowledge that this score was to be derived from their 
judgments made during the committee meetings on the study and ap- 
proval of the job analyses or during the job evaluation meetings. 
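
The scoring rule described above amounts to a lookup and a sum. In the sketch below the trait names are hypothetical placeholders, not items quoted from the check list, and `occl_score` is an illustrative name.

```python
# Points per rating, as given in the text: A = 3, B = 2, C = 1.
POINTS = {"A": 3, "B": 2, "C": 1}


def occl_score(ratings: dict[str, str]) -> int:
    """Sum the points for every rated characteristic of one job."""
    return sum(POINTS[rating] for rating in ratings.values())


# Hypothetical ratings for three characteristics of one job.
example_ratings = {
    "memory for details": "A",
    "arithmetic computation": "B",
    "tact in dealing with people": "C",
}

print(occl_score(example_ratings))
```

An unrecognized rating letter raises `KeyError`, which is a convenient way to catch transcription slips in the rating sheets.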

"The following double-entry table shows the relationship obtained be- 
tween the total evaluated points and this occupational characteristics 
check list score (Table 1). 


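Table 1
The Relationship Between Total Evaluated Points and the Occupational
Characteristics Check List Score
Note: r = .74 ± .06 (S.E.)

[Double-entry table. Vertical axis: Occupational Characteristics Check List score, Low to High. Horizontal axis: Total Evaluated Points, Low 1 2 3 4 5 High.]
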
The Pearson coefficient of correlation between these two variables was
.74, ±.06 (S.E.). The distributions for both variables were normalized
into an approximate 10-20-40-20-10 per cent distribution, for purposes
of presenting the data in the double-entry table, Table 1. The Pearson
coefficient of correlation obtained between these two variables without
normalizing the distribution was .79, ±.05 (S.E.). A rather high rela-
tionship was expected since the raters had been trained in the use of the
OCCL. The judgments were made by people familiar with a common 
body of information about the job content and requirements and specifi- 
cations of the jobs. The considerable relationship found to exist be- 
tween OCCL scores and total evaluated points is suggestive. It is sug- 
gestive of simpler and perhaps better, more valid ways of evaluating jobs. 
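
The paper does not say how the two distributions were reduced to an approximate 10-20-40-20-10 per cent split. One plausible rank-based method is sketched below as an assumption: scores are ordered, and each is placed in one of five steps according to where its rank falls among the cumulative shares. The function name and the tie handling (ties keep their input order) are choices made for illustration.

```python
from bisect import bisect_right
from collections import Counter

# Cumulative upper bounds of the first four steps of a 10-20-40-20-10 split.
CUM_BOUNDS = [0.10, 0.30, 0.70, 0.90]


def five_step(scores: list[float]) -> list[int]:
    """Assign each score a step from 1 (low) to 5 (high) by rank."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    steps = [0] * n
    for position, index in enumerate(order):
        midpoint = (position + 0.5) / n  # rank expressed as a proportion
        steps[index] = bisect_right(CUM_BOUNDS, midpoint) + 1
    return steps


# With 53 jobs the split can only be approximate.
counts = Counter(five_step(list(range(53))))
print(sorted(counts.items()))
```

For 53 jobs this yields groups of 5, 11, 21, 11, and 5, which is as close to 10-20-40-20-10 per cent as whole jobs allow.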

If scores derived from the OCCL or an improved version of it could 
eventually be used immediately for evaluation instead of evaluated points, 
the job evaluation phase of the program as conducted in this study could 
be entirely eliminated. This would have resulted in a saving of some 12 
committee hours or 60 individual hours, plus time for planning, discussion, 
and analysis performed outside the committee meetings. 

Development and evaluation by validity studies may in the future 
yield job characteristic check list systems of considerable dependability 
when used with well-trained committee members. Criteria for such 
validity studies may well take the form of median salaries earned for the 
same occupation in the community. If so, much uniformity in use of job 
titles as well as more adequate job descriptions and specifications through- 
out the industrial and business community is needed. 


Conclusion 


Results show that the Occupational Characteristics Check List had 
some utility in job evaluation when the members of the job evaluation 
committee were carefully trained. It provided a rough measure of the 
value of the jobs in a considerably shorter time than was required in job 
evaluating, as conducted with a simplified job evaluation system. The 
OCCL scores were found to correlate .74, ±.06 (S.E.) with total evaluated
points for a population of 53 jobs. Further development and appraisal of 
the check list system of evaluating jobs seem indicated. 


Received May 4, 1948. 
Early publication. 


Fakability of the Strong Interest Blank and the 
Kuder Preference Record * 


Howard P. Longstaff 
University of Minnesota 


Interests have long been considered one of the important factors in 
vocational adjustment. Two interest tests have become nationally prom- 
inent, the Strong Interest Blank and the Kuder Preference Record. 
Since both of these tests depend upon the subject’s statement of his likes,
dislikes, and indifferences, a crucial question is how susceptible these 
tests are to faking. Some evidence exists that faking is possible on the 
Strong Interest Blank (1, 2, 5, 7). There has been a feeling on the part 
of vocational psychologists, however, that even though some faking is 
possible on the Strong, it is probably less susceptible to malingering than 
the Kuder (4). 

The purpose of this study was to explore the fakability of both in- 
struments, and to make a comparison between them as to which was the 
more fakable. Since the tests are not scored in the same manner nor 
Scores reported in the same terms, exact comparisons are difficult, but 
rough comparisons are possible. The subjects were 59 students, 24 
women and 35 men on the Strong and 22 women and 37 men on the Kuder 
in an evening Extension Division class in Vocational Development and 
Personnel Psychology at the University of Minnesota. These subjects 
were mature individuals, most of whom were employed. They are 
probably representative of the more intellectual type of person one would 
meet in the industrial employment office. In other words, they are the 
type of person who would be given psychological tests in actual industrial 
selection offices. 

The experimental procedure was as follows: The subjects first took 
the Strong Interest Blank and the Kuder Preference Record as a part of 
a battery of tests given in the laboratory part of the above mentioned 
course. They were instructed to be as frank and honest as possible as the 
results would be used to help them in evaluating their vocational choices. 
After they had taken the tests under these conditions, it was then pointed 
out to them that part of the value of a psychological test, to be used in 
selecting employees, is its imperviousness to malingering and here was a

* This study was made possible by a grant-in-aid from the Graduate School of the
University of Minnesota.



chance for them to discover how well they individually could fake the 
results on the two measuring devices as well as helping to discover the 
general fakability of these tests. 

The men's form of the Strong blank was used for both sexes but the 
Kuder percentiles were based on the women's norms for the women and 
the men's norms for the men. 

The subjects were instructed to try to lower their scores on the com- 
putational, persuasive, social service, and clerical divisions of the Kuder
and the accountant, life insurance salesman, real estate salesman, personnel 
director, and office man divisions of the Strong blank. Similarly, they 
were to try to raise their scores on the mechanical, computational, 
scientific, artistic, literary, and musical divisions of the Kuder and car- 
penter, mathematician, engineer, physicist, chemist, artist, author- 
journalist, and musician parts of the Strong blank. The various groups 
to be faked were written on the blackboard and the direction of the faking 
indicated. Thus, the subject would look at a test item and try to 
answer it so it would boost the “fake-up groups” and depress the “fake-down”
groups.

This was a complicated faking procedure as all the above mentioned 
interest groups had to be faked simultaneously. Thus the subjects were 
faking some groups up at the same time they were faking others down. 
Such being the case, our results are probably not as pronounced as they
would have been if the subjects were trying to fake only one interest 
category. It must be kept in mind that when one interest category is 
faked it automatically changes the scores on other interest categories in 
the tests. 

The data were treated as follows: The number and per cent of men and
women who did one of the following things were computed.


Strong 


Moved from C or C+ to B—, B or B+ (one letter group up) 

Moved from A to B+,B or B— (one letter group down) 

Moved from C or C+ to A (two letter groups up) 

Moved from A to C+ or C (two letter groups down) i 

С or C+ to C or C+; В— B or Bt to B-, B or B+ (no faking up) 
A to A; B+, Bor B— to B+, B or B— (no faking down) 

A to A (could not fake up) 

C or C+ to C or C+ (could not fake down) 

Those who moved in wrong direction. 


Kuder 


25-74 percentile (one group up) 


Moved from 0-24 percentile to 
74-25 percentile (one group down) 


Moved from 75+ percentile to 


362 Howard P. Longstaff 


Moved from 0-24 percentile to 75+ percentile (two groups up) 

Moved from 75+ percentile to 0-24 percentile (two groups down)

0-24 percentile to 0-24 percentile or 25-74 percentile to 25-74 percentile
(no faking up)

75+ percentile to 75+ percentile or 74-25 percentile to 74-25 percentile 
(no faking down) 

75+ to 75+ percentile (could not fake up) 

0-24 to 0-24 percentile (could not fake down) 

Those who moved in wrong direction. 
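
The classification scheme listed above can be sketched in code. What follows is only an illustration under stated assumptions, not the author's tabulating procedure: the names STRONG_BAND, kuder_band and classify_shift are hypothetical, and the three-band coding (0 = low, 1 = middle, 2 = high) is introduced here for convenience.

```python
# Hypothetical coding of Strong letter grades into three bands (an assumption).
STRONG_BAND = {"C": 0, "C+": 0, "B-": 1, "B": 1, "B+": 1, "A": 2}


def kuder_band(percentile):
    """Band a Kuder percentile: 0-24 -> 0, 25-74 -> 1, 75 and above -> 2."""
    if percentile < 25:
        return 0
    if percentile < 75:
        return 1
    return 2


def classify_shift(before, after, direction):
    """Label the change from the original band to the faked band.

    before, after: band indices (0, 1 or 2)
    direction: "up" or "down", the instructed direction of faking
    """
    step = after - before
    if direction == "down":
        step = -step              # count movement in the instructed direction
        at_limit = before == 0    # already lowest band: no room to fake down
    else:
        at_limit = before == 2    # already highest band: no room to fake up
    if step < 0:
        return "wrong direction"
    if step == 0:
        if at_limit:
            return "could not fake " + direction
        return "no faking " + direction
    return f"{step} group(s) {direction}"
```

For instance, classify_shift(STRONG_BAND["C+"], STRONG_BAND["A"], "up") labels a subject who moved two letter groups up, which is the rigorous criterion used in the Results.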


Results


Tables 1 and 2 contain the data for the Strong and Tables 3 and 4
portray the data for the Kuder. These data are summarized in Tables
5, 6 (Strong), 7, 8 (Kuder), while Tables 9, 10 and 11 show the compara-
tive data on the two tests.*

These data indicate that even under the very complex and difficult
situation of simultaneously faking several different interest categories
upward and downward, both the Strong and Kuder tests are vulnerable.
Some interest categories were easier to fake than others for this group of
subjects. Using the most rigorous criterion of faking, at least two letter
grades (see Table 9), four interest categories, chemist, artist, author-
journalist, and musician, on the Strong and one, artistic, on the Kuder are
successfully faked upward by over 49 per cent of the male subjects. Two
other categories on the Strong, engineer and physicist, are successfully
faked upward by over a third of the male subjects. Only two categories,
carpenter and mathematician, on the Strong and four on the Kuder,
mechanical, scientific, literary, and musical, are faked upward by less than
one third of the male subjects. On the fake downward categories we find
less successful faking, but even here four out of five categories on the
Strong, accountant, real estate salesman, personnel director, and office
man, are faked downward two letter grades by over 20 per cent of the
subjects, and on the Kuder two categories, persuasive and social service,
are faked downward two groups by over 60 per cent of the subjects, while
clerical and computational are successfully faked downward from high to
low by 27 and 19 per cent respectively of the male subjects. Thus, the
Kuder was easier for these subjects to fake downward while the Strong 
was noticeably easier to fake upward. In general, the female subjects are 

*To reduce printing costs, Tables 1 to 8 inclusive have been deposited with the
American Documentation Institute. Order Document 2513 from American Documen-
tation Institute, 1719 N Street, N. W., Washington 6, D. C., remitting $0.50 for micro-
film (images 1 inch high on standard 35 mm. motion picture film) or $0.50 for photo-
copies (6 x 8 inches) readable without optical aid.


Fakability of Strong Interest Blank 


Table 9

Strong vs. Kuder

[Table body not recoverable from the source scan. It compared the Strong and
Kuder for male and female subjects on the fake-up and fake-down categories.]

Note: This table is based upon the total per cent of subjects who could fake from
insignificant scores to significant ones, two letter grades or groups (rigorous criterion).


Table 10

Strong vs. Kuder

[Table body and note not recoverable from the source scan.]




Table 11

Strong vs. Kuder

[Table body not recoverable from the source scan.]




occupational interests. 

When the less rigorous criterion of faking, one or two letter grades,
was used (see Table 10), we find over 74 per cent of male subjects faking
upward successfully on seven out of eight of the interest categories on the
Strong and in two out of the five interest categories on the Kuder. On
the fake downward categories all the occupational groups (herein
measured) on the Strong are successfully faked by 63 or more per cent of the
male subjects and three out of four of the categories on the Kuder are
faked downward by 59 or more per cent of the male subjects. On this
more lenient criterion the women make a better showing but are still
somewhat less successful in faking than the men.
The data we have just been discussing, those presented in Tables 9
and 10, may be misleading due to the fact that the Strong Interest Blank
and Kuder Preference Record are scored differently. Scores on the Kuder
are reported in percentile ranks based on a randomly selected group of
subjects, which means that the majority of subjects who have low interest
ratings cluster around the mean or in the group we have designated as 25
to 74 per cent. Thus they can't move either upward or downward more
than one group. Scores on the Strong blank are based not upon a
randomly selected norm group but upon specific groups of successfully
employed men in the various occupations who have high interests in the
interest categories for which keys are available. According to Strong,
about 40 per cent of college students score at the C or C+ level on his
blank, which means in a typical group of college students about half of
them have two letter grades available to fake upward on the Strong but
only one group on the Kuder. In our data, approximately 79 per cent
of males and 62 per cent of females score C or C+ originally on the fake
upward categories on the Strong as compared to approximately 37 per
cent of males and 26 per cent of females who score below 24 percentile on
the Kuder. In a similar fashion approximately 34 per cent of males and
31 per cent of females scored A originally on the Strong fake downward
categories as compared to approximately 55 per cent of males and 35 per
cent of females who score over 75 percentile on the Kuder. Thus it is
apparent that our rigorous criterion (Table 9) gives the Kuder an unfair
advantage. Probably a more realistic comparison is one based upon the
per cent of subjects who can fake upward to A scores on the Strong and
to 75 percentile or above on the Kuder or downward to C scores on the
Strong or below 24 percentile on the Kuder. These are the significant
levels recommended by the authors of these tests and are the ones usually
considered in vocational counseling. Table 11 presents such a comparison.
Here again the outstanding fact is that both tests are fakable, the
Strong considerably more so than the Kuder on the fake upward categories
and the Kuder more so than the Strong on the fake downward categories
for the male subjects. The women are less successful in faking by this
criterion, as they were on our other criteria. There are a couple of notable
exceptions: they are surprisingly good at faking the Kuder upward on the
mechanical category and do much better than the men in faking the
Kuder upward on the scientific category.

Because the tests are fakable it does not follow that they are faked in 
general practice. Terman (6) found little faking on his masculinity- 
femininity test even when his subjects knew the purpose of the test yet 
when told to fake they were able to do so to a marked degree. The same 
is true with the Strong interest tests. Strong (7, pp. 686-687) makes the 
following statement: “The large number of correlations over .80 and 
particularly over .90 . . . are good evidence that there is remarkable 
consistency in response to interest items. A small amount of fudging 
would make such high correlations unlikely.” This is the best answer 
possible to our problem. The facts show that little faking goes on in the
guidance situation. However, what goes on in the hiring situation is an
entirely different matter. It would seem from these data that special
effort should be made by the examiner when using these tests to stress the
desirability of truthful answers. Strong (7, p. 690) has given several sug-
gestions for overcoming faking. These are: (a) emphasize speed, which
reduces the time for thinking out faked answers; (b) view very high scores
with suspicion because faking tends to produce abnormally high scores;
(c) consider scores on secondary interests; (d) develop new norms where
items obviously related to the occupation are reduced in weight or omitted
entirely; and (e) emphasize to the subject that his future will suffer if
he gets into a job he dislikes.

Brief comment on these suggestions is in order. Very high scores may
indicate faking but it does not follow that faking will always produce 
high scores. In our data, out of 531 possible “fake upward" scores only 
60 scores were as high as a standard score of 60 or above on the Strong 
blank. In regard to the suggestion that new norms be developed, 
Steward (5) has done this but still finds the reduction in the seriousness of 
faking “not enough to recommend the keys for their value as a protection 
device.” Steward (5) also strongly recommends that those who use the
Strong test as part of the Steward selection battery for life insurance
salesmen should emphasize to the subject the futility of faking and
thereby getting into a job which he may later dislike.

Strong and Kuder might well consider the addition of a second set of
directions for their tests headed “directions to be used in employment
offices.” These special directions should emphasize speed and the fact
that in the long run the applicant is only prolonging his vocational diffi- 
culties if he fakes the tests. In view of this not too hopeful picture two 
other suggestions for further research seem indicated. Add a set of items 
to the blanks such as those used in the Minnesota Multiphasic Personality
Inventory (3) and other personality tests which attempt to detect a sub- 
ject who is putting himself in an unduly favorable position (the L score 
on the MMPI). The difficulty with this procedure, however, is that 
the sophisticated subjects may not “bite” on these items. Another 
possibility is to make an empirical study of present items in the attempt 
to locate items distorted in faking but which are not obvious even to the 
sophisticated “faker.” The approach might well follow the techniques 
used in developing the “K scale" on the MMPI. 

Faking affects the interest maturity and occupational level scores 
on the Strong in the manner one would expect. The interest maturity 
scores drop from an average score of 57 on the original test to 45 on the 
fakes. As it happens, the occupations to be faked downward are those 
which normally have high I. M. scores and those to be faked upward have 
low I. M. scores. The lowering of the I. M. score in a way validates the 
fact that multiple faking was involved and that the results are not just 
due to changing one occupation and thus causing all others to change. 
It probably also is the result of choosing the more obvious and stereotyped 
items usually associated with the occupations in question. Since we find
such a large shift in I. M. with faking, further study of this phenomenon is
indicated as a possible indicator of faking.

The occupational level remains practically unchanged, moving from
an average of 56 for the original to 54 on the faked, illustrating the can-
celling effect as one shifts some occupations upward and the others down-
ward.


Summary 


Considering the Kuder as a whole and the thirteen interest categories 
herein studied on the Strong: 


1. Both tests are decidedly fakable. 

2. Some interest categories are more fakable than others. 

3. Women are less successful in faking than men. 

4. The Strong test in general is easier to fake upward than the Kuder, 
while the Kuder is easier to fake downward than the Strong. 

5. It does not necessarily follow that much faking goes on in actual 
use of these tests. The potential danger is present, however. 

6. The interest maturity and occupational level scores behave as 
would be expected. Further study of the I. M. scale as an index of faking 
is indicated. 




7. A new set of directions should probably be made for both tests in
order to minimize faking. 

8. Further research is indicated to explore the possibility of devel- 
oping an empirical scale to detect faking. 


Received November 14, 1947. 


References 


1. Benton, A. L., and Kornhauser, G. I. A study of “score faking” on a mechanical
interest test. J. Ass. Amer. med. Colleg., 1948, 23, 57-60.
2. Bordin, E. S. A theory of vocational interests as dynamic phenomena. Educ. &
Psych. Meas., 1943, 3, 49-65.
3. Meehl, P. E., and Hathaway, S. R. The K factor as a suppressor variable in the
Minnesota Multiphasic Personality Inventory. J. appl. Psychol., 1946, 30, 525-
564.
4. Paterson, D. G. Vocational interests inventories in selection. Occupations, 1946,
25, 152-153.
5. Steward, V. The problem of detecting fudging on vocational interests tests. Los
Angeles, Calif.: Personnel Reports for Sales Executives. January, 1947.
6. Terman, L. M., and Miles, C. C. Sex and personality. New York: McGraw-Hill
Book Co. 1936.
7. Strong, E. K., Jr. Vocational interests of men and women. Stanford, Calif.: Stanford
University Press. 1943.




Words Are Dynamite * 


R. Stafford Edwards 
Edwards and Company, Inc., Norwalk, Connecticut 


When Noah Webster's wife caught him kissing the cook she is sup- 
posed to have exclaimed, “Why, Noah, I'm surprised," and he to have 
answered, “No, dear, I am surprised, you are astonished.”

Another story has it that one wife got so annoyed at another who 
bragged about her husband being sophisticated that she looked it up in 
the dictionary . . . and had the great pleasure at their next meeting of 
agreeing that he was indeed a fakir . . . for its meaning truly is “falsely 
or fallaciously worldly-wise.” 

Such misconception of words has caused much amusement and little 
damage but imagine the tragedies that might occur if some evil-minded 
group succeeded in educating children to the misconception that “poison” 
meant “bread.” An erroneous use of words has been instilled into re- 
lations between employers and employees, and even those who do not 
believe there is real class hatred fan its fires by constant misuse of those
words. 

Employers, union officers and high government officials who wouldn’t 
tolerate loose handling of dynamite near their headquarters toss explosive 
words around with utmost carelessness. 

Foremost in that class is the word “labor.” Dictionaries define it 
simply as “mental or physical toil.” There is no definition giving it 
status as representing any group of people. In the eras when Europe had 
suppressed groups their leaders coined terms to fan their hatred of the 
idle group which did the suppressing. In France the term “Bourgeoisie,” 
for example, really defined the great middle class which is typical of our 
American population. 

There is no “landed” class or “nobility” in the United States. In
fact, in France the term “Democrat” was synonymous with “Bourgeoisie”
and the United States has grown to be commonly known as a “Democ-
racy” for perhaps that reason . . . although the term inaccurately de-
scribes our type of government which is really “republican,” with gov-
ernment by representatives of the people, not by the people themselves.

* The editor solicited this paper from Mr. Edwards, who is President of an organi-
zation that has been in existence for 76 years. Its amicable relations with employees
and high rate of production have never been interrupted. The care with which they use
words may be in part responsible. Whether readers agree or disagree with his particular
terminology, they will welcome having their attention directed to the dangers that lie
in the use of emotionally charged words. - Editor.

During the era leading up to the Russian revolution the sickle and
hammer became the symbols for the “workers” but you may perhaps re-
call that at that time Americans shuddered in horror at the term “Bol-
shevik.” It seemed to us all that was cruel, ruthless and vicious, but it
really meant “Extreme Left Socialist.” They are still “Bolsheviki” and
the newer term “Communist” is descriptive of a form of government only.

Because there is no real class difference in this country the term- 
inology had to revolve around those who, at the moment, were employers 
and those who were employed even though they were all workers at their 
own jobs. In the drive to sell employees the idea of unionization the 
word “labor” has been mutilated to mean everything representative of 
those who work . . . even to the point of spelling it with a capital as we 
do the proper noun “American.” 

Leaders of the union movement have further exaggerated the word 
"labor" as inclusive of all who work when, in fact, their own statistics, 
and those of the Department of Labor, show considerably less than half 
of those who work to be union members. 

Without for one moment questioning the right or the desirability of 
employees to organize into unions, it is nevertheless true that these terms 
were appropriated by leftists and extremists to create a class struggle in 
this country that would make organization easier . . . as it was in the
countries having actual class differences. So, in place of “nobility” they
wrapped everyone who employed into a class named “management” or
“industry.” Of course, that must include the little fellow who, until
yesterday, was employed but who today starts his small grocery store or
gas station. At the same moment employers developed an apoplectic
hostility towards anyone who favored employee organization and repre-
sentation and called him a “Bolshevik.”

Along with the misuse of the word “labor” as meaning a mass of
people instead of “physical or mental toil,” union leaders invented a new
catch phrase which was pretty glamorous: “labor is not a commodity.”
Here again you have the meaning of “labor” distorted to represent a
mass of people . . . and no one could disagree with the premise that
buying and selling people would be slavery. But “mental and physical
toil” is obviously rented or hired just as any other commodity and at
managerial and professional levels as well as unskilled levels.

Both the leftists and rightists have done a devastating job. More
through carelessness and laziness than through agreement, their erro-
neous terminology has been adopted by the press, those in high govern-
ment office, the courts and, most regrettably, by conciliation agencies
whose obvious function is to dispel the idea that there is any real class
struggle in our employer-employee relations problem. Every piece
of printed matter, from newspapers to company notices on bulletin boards,
tosses “labor and management” or “labor and industry” phrases into a
class hatred fire that has little other combustible material in our form of
life and government. Every day some radio commentator uses such
phrases as “there appears to be a split in the ranks of labor” or “the
forces of labor and management are squared away for a finish fight” . . .
to describe what is actually nothing but a normal difference of opinion in
the process of negotiation.

This lurid and false picture of one mass of serfs called “Labor” being
downtrodden by a small but selfish group called “Management” or “In-
dustry” is further irritated by such meaningless terms as “labor relations”
and “industrial relations.” The problem is being still further intensi- 
fied by habitual use of degrading terms such as “worker” instead of the 
more dignified term “employee.” 

If the word “labor” need be used at all it should be confined to its 
correct meaning, to designate the work done. Everyone who works is an 
“employee” of someone or some organization. Our problem is to further 
good relations between that employee and his employer. The mass of 
people who are employed has no better designation than “employees” 
and the mass of those who employ, “employers.” The dividing line be-
tween them is indeed small and subject to quick change from one side to 
the other. Is it not the ambition of most employees to become an em- 
ployer? 

There is true distinction between the large majority of employees 
who do not care to belong to a union and those who do; the only correct 
terminology for the latter is “unionized employees” and consequently the 
mass terminology is simply “union” or “unions” . . . not “Labor.” 

Employers can accomplish much to promote better employer-employee 
relations by sticking closely to true terminology in all notices and press 
releases, and by using calmer, more dignified terms such as “the company”
instead of “the management” and “employee” instead of “worker.”

There are plenty of fundamental, non-inflammatory facts to justify
and recommend the theory of unionization without resort to instituting
a spurious class struggle in this nation, born without it. Union leaders
and employers could well work together to find less inflammatory phrases
in their written and printed interchanges than “demands” . . . why not
“proposals”? Why should any contract deal at length with such phrases
as “grievance procedure” and “Grievance Committee,” etc.? A griev-
ance is a “wrong.” Why not “Determination Procedure” and “Deter-
mination Committee”?




In conclusion it might be pointed out that these recommendations 
for a more intelligent use of words in dealing with this problem have the
firm foundation of all good psychological practice . . . bring the truth 
to the surface and deal with it instead of being led into a devastating 
mirage of misconception and untruth by use of emotionally toned words. 
Such words are dynamite. 


Received May 21, 1948. 
Early publication. 


A Technique for the Construction of Attitude Scales 


Allen L. Edwards and Franklin P. Kilpatrick 
The University of Washington 


Earlier articles (3, 6) have reviewed the various methods which have 
been used in the construction of attitude scales: the method of equal ap- 
pearing intervals developed by Thurstone (16), the method of summated 
ratings developed by Likert (14), and the method of scale analysis devel- 
oped by Guttman (9). The method of equal appearing intervals and the 
method of summated ratings are similar in that both provide techniques 
for selecting from an initial large number of items, a set of items which 
constitutes the measuring instrument. Scale analysis differs from these 
two methods in that it is concerned with the evaluation of a set of items, 
after the items have been selected in some fashion or another. 

In the method of equal appearing intervals, items of opinion are sorted 
by a judging group into 9 or 11 categories constituting a continuum rang- 
ing from unfavorable to favorable. The scale value of each item is found 
by locating the point on the continuum above which and below which 50 
per cent of the judges place the item. The spread of the judges’ rating is 
measured by Q, the interquartile range. A high Q value for an item indi- 
cates that the judges are in disagreement as to the location of the item on 
the continuum and this, in turn, is taken to mean that the item is ambigu- 
ous. Both Q and scale values are used in selecting items for the attitude 
test. Approximately 20 items with scale values equally spaced along 
the continuum and with low Q values are selected for the test. Scores 
on the test are determined by finding the median of the scale values of 
the items with which a subject agrees. 
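
The computations just described may be illustrated by a short sketch. It is offered under assumptions and is not Thurstone's exact procedure: the function names are hypothetical, the scale value is taken as the plain median of the judges' category numbers, and Q is taken from the quartiles returned by Python's statistics module rather than from a graphically interpolated cumulative curve.

```python
from statistics import median, quantiles


def scale_value_and_q(placements):
    """Return (scale value, Q) for one item.

    placements: the category numbers (e.g. 1 to 11) into which the
    judges sorted the item. The scale value is the median placement;
    Q is the interquartile range of the placements.
    """
    q1, _, q3 = quantiles(placements, n=4, method="inclusive")
    return median(placements), q3 - q1


def attitude_score(scale_values, agreed_items):
    """Score a subject as the median scale value of the items agreed with.

    scale_values: mapping from item number to scale value
    agreed_items: item numbers the subject endorsed
    """
    return median(scale_values[i] for i in agreed_items)
```

An item on which the judges agree closely yields a small Q, while an item scattered across the continuum yields a large Q and would be set aside as ambiguous.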

In the method of summated ratings, items are selected by a criterion
of internal consistency. Subjects check whether they strongly agree,
agree, are undecided, disagree, or strongly disagree with each item.
Numerical weights are assigned to these categories of response using the 
successive integers from 0 to 4, the highest weight being consistently 
assigned to the category which would indicate the most favorable attitude. 
A high and low group are selected in terms of total scores based upon 
the sum of the item weights. The responses of these two groups are then 
compared on the individual items and the 20 or so most discriminating 
items are selected for the attitude test. A subject’s score on this test is
determined by summing the weights assigned to his responses to the 20 
items. 
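
The item-selection step of the method of summated ratings can be sketched as follows. This is an illustration under assumptions, not Likert's own computation: the function name is hypothetical, the high and low groups are taken as the top and bottom quarter of total scores, and the index of discrimination is simply the difference between the two groups' mean item weights.

```python
def discrimination(responses, fraction=0.25):
    """Return one discrimination index per item.

    responses: one row per subject, each row a list of item weights (0 to 4)
    fraction: share of subjects placed in each of the high and low groups
              (an assumed value; the text does not fix it)
    """
    n_subjects = len(responses)
    n_items = len(responses[0])
    k = max(1, int(n_subjects * fraction))
    # Rank subjects by total score, i.e. the sum of their item weights.
    order = sorted(range(n_subjects), key=lambda s: sum(responses[s]))
    low, high = order[:k], order[-k:]
    return [
        sum(responses[s][i] for s in high) / k
        - sum(responses[s][i] for s in low) / k
        for i in range(n_items)
    ]
```

The items with the largest indices would be the candidates for the final test; an item on which the high and low groups respond alike gets an index near zero.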




In scale analysis, a complete set of items is tested to determine whether 
they, as a group, constitute a scale in the sense that from the rank order 
score it is possible to reproduce a subject’s response to the individual 
items. The degree to which this is possible is expressed by a coefficient 
of reproducibility. Although ordinarily Guttman uses 10 to 12 items, 
to give a simple explanation of this coefficient let us suppose that we have 
but 3 items, each with but 2 categories of response, agree and disagree. 
We shall assume that the agree response, in each instance, represents a 
favorable attitude and the disagree response an unfavorable attitude. 
A weight of 0 is assigned to the disagree response and a weight of 1 is as- 
signed to the agree response. Let us also suppose that for the first item 
we have in our sample 10 subjects with weights of 1 and 90 with weights 
of 0; for the second item we have 20 subjects with weights of 1 and 80 with 
weights of 0; and for the third item we have 40 with weights of 1 and 60 
with weights of 0. 

In the case of perfect reproducibility, the 10 subjects with weights of 
1 on the first item will be the 10 subjects with the highest rank order 
scores. These 10 subjects will also be included in the 20 who have weights 
of 1 on the second item and these 20, in turn, will be included in the 40 
who have weights of 1 on the third item. It would also be true that only
4 patterns of item response would occur, if the set of items were perfectly
reproducible. For the sample at hand, these patterns and the scores
associated with them would be: AAA-3; DAA-2; DDA-1; DDD-0.
Since all responses could be perfectly predicted from the scores, the coeffi-
cient of reproducibility, in this instance, would be 100 per cent. Perfect
reproducibility is seldom found, however, and in practice a coefficient of
85 per cent or higher is believed satisfactory for judging a set of items to
be a scale. Various techniques for computing the coefficient of repro-
ducibility have been developed and are described in the articles by
Festinger (7), Clark and Kreidt (2), and Guttman (11, 12).
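
The three-item example above can be checked with a small sketch. It assumes one simple error-counting convention, which is not necessarily that of any of the articles cited: each subject's responses are predicted from his score by supposing that he agrees with the items most often endorsed in the sample, and every mismatch counts as one error. The function name is hypothetical.

```python
def reproducibility(responses):
    """Return the proportion of responses reproduced from the scores.

    responses: one row per subject, each row a list of 0/1 item weights
    """
    n_subjects = len(responses)
    n_items = len(responses[0])
    # Order the items from most to least often endorsed.
    endorsed = [sum(row[i] for row in responses) for i in range(n_items)]
    order = sorted(range(n_items), key=lambda i: -endorsed[i])
    errors = 0
    for row in responses:
        score = sum(row)
        # A subject with score s is predicted to agree with the s most
        # frequently endorsed items and to disagree with the rest.
        predicted = [0] * n_items
        for i in order[:score]:
            predicted[i] = 1
        errors += sum(p != r for p, r in zip(predicted, row))
    return 1 - errors / (n_subjects * n_items)


# The sample described in the text: 10 AAA, 10 DAA, 20 DDA, 60 DDD.
perfect = [[1, 1, 1]] * 10 + [[0, 1, 1]] * 10 + [[0, 0, 1]] * 20 + [[0, 0, 0]] * 60
```

With these 100 subjects the marginals are 10, 20 and 40 agreements, as in the text, and every response is reproduced from the score. Replacing one DDD subject by the pattern ADD introduces two errors in 300 responses under this convention.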

Scale analysis, in the sense mentioned above, thus becomes a tech-
nique secondary to the problem of item selection. The important prob-
lem is to obtain a set of items which the investigator may have some as-
surance will scale when a particular technique of testing for scalability is
applied. Up to the present time, the problem of item selection in scale 
analysis seems to have been left largely to the intuition and experience of 
the investigator. The only practical rules suggested are that one should 
simply rephrase the same question in slightly different ways (7, p. 159) 
or that one should look for items with as homogeneous content as possible 
(12, p. 461). This latter suggestion indicates that if we are interested 
in the problem of attitude toward the Negro, we should break this universe 
of content down into sub-universes constituting perhaps such areas as 
attitude toward the Negro in public eating places, attitude toward the 
Negro as a resident in the community, attitude toward the Negro as a 
voter, attitude toward the Negro as an employer, attitude toward the 
Negro in public conveyances, and so on. But even here, we find that 
attitude toward the Negro, let us say, in public conveyances can be 
broken down into areas of content even more homogeneous by enumera- 
ting the specific conveyances: streetcars, busses, trains, planes, and so on. 
Each of these areas of content might possibly be broken down into still 
more homogeneous areas. Eventually, we may end up, as Festinger sug- 
gests, with multiple rephrasings of the same question, and our two rules 
are thus but one (7, p. 159). 

Obviously, any technique which enables us to select a set of items 
from the large number of possible items, with some assurance that the set 
of items selected will, in turn, meet the requirements of scale analysis 
would be of great value. In this paper, a technique which has proved 
successful in doing this is described. For reasons which will become clear 
as we proceed, we have called this technique the scale-discrimination 
method of attitude scale construction (5). 


The Scale-Discrimination Technique 


The scale-discrimination method is based upon preliminary investiga-
tions which showed that the cutting point⁴ of an item is related to the
Thurstone scale value of the item and that the reproducibility⁵ of an item
is related to the discriminatory power of the item (6). The discrimina-
tory power of an item, it has also been shown, is not, as might seem at
first glance, merely a function of the item’s scale value. It can easily be
demonstrated that items with comparable Thurstone scale and Q values
may differ tremendously in their power to differentiate between those
with favorable and those with unfavorable attitudes.*

⁴ The cutting point of an item marks the place in the rank order scores of the subjects
where the most common response shifts from one category (agree) to the next (disagree).
Between cutting points, in a perfect scale, all responses would fall in the same category.

⁵ The reproducibility of an item is measured by the degree to which responses to the
item can be reproduced from the rank order scores of the subjects.

Statements of opinion concerning science were collected from a variety 
of sources. Books and essays were consulted. Individuals were asked 
to express their opinions in brief written statements. We eventually 
collected 266 statements of opinion about science. In editing these items, 
particular attention was paid to eliminating those items which: (1) were 
liable to be endorsed by individuals with opposed attitudes; (2) were 
factual or could be interpreted as such; (3) were obviously irrelevant to 
the issue under consideration; (4) appeared likely to be endorsed by every- 
one or by no one; (5) seemed to be subject to varying interpretations for 
any reason; (6) contained a word or words not common to the vocabularies 
of college students. Also, due to emphasis upon the matter during both 
the collecting and editing of the statements, most of the 155 statements 
finally selected expressed a clear-cut favorable or unfavorable opinion 
about science. 

Thirteen other items, which might be called “control” items, were
added to the original 155. These 13 items were added to determine how
they would fare at various stages of the scale-discrimination method.
Of the 13 items, we judged that 7 were “neutral” items in the Thurstone
sense; 2 were items which could possibly be interpreted as factual; 1 was
believed to be too extreme for many endorsements; 1 was judged ambiguous
because the words “scientific holiday” could be interpreted as
meaning a moratorium or as meaning a celebration; 1 was judged ambiguous
because more than one dimension was involved; and 1 was judged
irrelevant. Thus there were 168 items in all which were used in testing
the scale-discrimination method of scale construction.⁷


Determining Scale and Q Values of the Items 


Envelopes numbered 1 through 110 were prepared. In each envelope
we placed a set of 3 X 5 cards lettered A, B, C, D, etc., and a
pack of slips of paper approximately 2 x 4 inches in size. On each slip
of paper one of the 168 items was printed along with the number of the
item. In each case the pack of slips was shuffled so that the items would
be arranged in no set order. The envelopes were given to an elementary
psychology class along with a set of instructions describing the Thurstone

⁶ For example, the extreme item: “All Republicans should be [illegible]” would
undoubtedly show a scale value at one extreme of the continuum and a low
Q value. But this item will not differentiate between those with favorable and
unfavorable attitudes toward Republicans for the obvious reason that both groups would
probably react in the same fashion to the item.

⁷ It should be noted that the inclusion of the “control” items mentioned is not
to be considered part of the scale-discrimination procedure.


378 Allen L. Edwards and Franklin P. Kilpatrick


sorting procedure and the members of the class were asked to sort the
items in accordance with the instructions.⁸

The item sortings of each subject were examined and we discarded 
those subjects whose sortings showed obvious reversals of the continuum 
or failure to carry out instructions. On this basis we were left with 82 
completed sets of judgments. 

Frequencies of judgments in each of the 9 categories for each item were
tabulated, translated into cumulative frequencies, and then into cumulative
proportions. An ogive was plotted for each item with cumulative
proportions on the axis of ordinates and scale values on the axis of
abscissas. Scale values were read to two decimal places (the second decimal
place being merely an approximation) by dropping a perpendicular to
the baseline of scale values at the point where the cumulative proportion
curve crossed the 50 per cent mark. In a similar fashion Q values were
determined by dropping perpendiculars at the 25th and 75th per cent
levels, Q being the scale distance between these two points or the
interquartile range.⁹

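The graphical reading of scale and Q values described above can be approximated numerically by linear interpolation on the cumulative proportions. The following Python sketch is illustrative only and is not the authors' procedure: the function names are hypothetical, and it assumes that category k of the 9 sorting categories spans the scale interval from k - 1 to k, which agrees with the “neutral” interval of 4.0 to 4.9 mentioned in the text but is not stated there explicitly.

```python
from typing import List, Sequence, Tuple


def crossing_point(cum_props: Sequence[float], p: float) -> float:
    """Scale position at which the cumulative-proportion curve reaches p.

    cum_props[k] is the cumulative proportion at the upper boundary of
    category k + 1, which is assumed to span the interval [k, k + 1].
    """
    lower = 0.0
    for k, upper in enumerate(cum_props):
        if upper >= p:
            # Linear interpolation inside the category that contains p.
            return k + (p - lower) / (upper - lower)
        lower = upper
    raise ValueError("p must not exceed the final cumulative proportion")


def scale_and_q(freqs: Sequence[int]) -> Tuple[float, float]:
    """Return (scale value, Q) from the judgment frequencies per category."""
    total = sum(freqs)
    cum: List[float] = []
    running = 0
    for f in freqs:
        running += f
        cum.append(running / total)
    scale_value = crossing_point(cum, 0.50)  # median of the judgments
    q = crossing_point(cum, 0.75) - crossing_point(cum, 0.25)  # interquartile range
    return scale_value, q
```

A call such as `scale_and_q(freqs)` with a list of 9 frequencies returns the interpolated median and interquartile range for one item.
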
The 168 items were then plotted in a bivariate distribution according
to scale and Q values, the scale values being plotted on the baseline.
The distribution of scale values was bimodal in shape. There were very
few items in the “neutral” section (none at all in between 5.0 and 5.9),
the modal categories being 1.0 to 1.9 and 7.0 to 7.9. The Q values of the
7 items which did fall in the “neutral” scale interval (4.0 to 4.9) were
quite low, 6 of the 7 falling well below the median Q value for all 168
items. All 7 of these items were “control” items, described previously.

A line was drawn through the distribution at approximately the
median Q value of all the items, 1.29. All items with Q values above
this point were rejected. We worked from here on with the remaining
83 items or with approximately the 50 per cent of the initial set of items
with the least degree of ambiguity as measured by Q. One of the “neutral”
control items was eliminated by this standard and 6 were acceptable.
These 6 items all had scale values between 4.0 and 4.9. No
items at all had been found in the scale interval 5.0 to 5.9 and the Q
criterion eliminated all items in the interval 3.0 to 3.9. One of the 2


⁸ This task was most laborious. Almost 14,000 slips of paper had to be sorted and
then tabulated. Some judging technique similar to that used by Ballin and Farnsworth
(1) or Seashore and Hevner (15) would reduce much of this labor, but even here the
task is not simple. Various methods which simplify the judging process are now being
tried and will be reported upon in another paper.

⁹ This operation was simplified by setting up a master chart with the cumulative
proportions on the Y axis and the scale values on the X axis. This chart was then
taped to a ground-glass plate which fitted over an enclosed wooden box containing a
100 watt bulb. Tracing paper could then be placed over this chart and the ogives for
the individual items quickly drawn.




factual items was rejected by the Q criterion and the ambiguous item 
with the words “scientific holiday” was also eliminated. The remaining 
10 “control” items would have to be judged acceptable by the Q criterion. 
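
The Q criterion described above amounts to retaining those items whose Q value does not exceed the median Q of the whole set. A minimal Python sketch follows; the function name and the dictionary layout are hypothetical and serve only to illustrate the rule.

```python
from statistics import median
from typing import Dict, Tuple


def retain_low_q(items: Dict[str, Tuple[float, float]]) -> Dict[str, Tuple[float, float]]:
    """Keep items whose Q value is at or below the median Q.

    `items` maps an item label to a (scale value, Q value) pair.
    Items with Q above the median are rejected as too ambiguous.
    """
    cutoff = median(q for _, q in items.values())
    return {name: pair for name, pair in items.items() if pair[1] <= cutoff}
```

With the data of the present study the cutoff would be approximately 1.29, the median Q reported in the text.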


Item Analysis 


actions. Each item was followed by a 6 point forcing scale (strongly 
agree, agree, mildly agree, mildly disagree, disagree, strongly disagree). 
Subjects were instructed to check for each item the one expression which 
most nearly described their own attitude with respect to the item. In 
all, 355 subjects filled out the questionnaire: 245 from sociology, psy- 
chology, and speech classes at the University of Washington; 60 from a 
local junior college; and 50 from a police school. Of these 355 papers,
346 were usable, 9 of them being incomplete or having more than one 
answer for a single item. 

Scoring was done in the usual Likert fashion, weights of 0 through 5 
being assigned to the 6 response categories, the weight of 5 being given to 
the strongly agree response in the case of items expressing a favorable
opinion about science, and to the strongly disagree response in the case 
of items expressing an unfavorable opinion about science. For the 6 
items in the scale interval, 4.0 to 4.9, the direction of the weights was 


resulting scores. The obtained range of scores was only 64 per cent of the 
possible range (140-405 obtained, 0-415 possible) with considerable 


Two criterion groups were chosen, approximately the upper and lower 
27 per cent, in terms of total scores. The range of scores for the lower
94 papers was from 140 to 300 and the upper 94 papers had scores ranging
from 343 to 405. The 83 items were then subjected to item analysis. 
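
The scoring and the choice of criterion groups can be sketched as follows. This is an illustration under stated assumptions, not the authors' worksheet: responses are assumed to be coded 0 (strongly agree) through 5 (strongly disagree), the function names are hypothetical, and the group size is taken as 27 per cent of the sample rounded up, which gives 94 for 346 papers. The treatment of the 6 items in the interval 4.0 to 4.9 is not recoverable from the text and is not modeled.

```python
import math
from typing import Dict, List, Sequence, Tuple


def likert_score(responses: Sequence[int], favorable: Sequence[bool]) -> int:
    """Total score for one subject.

    A favorable item earns 5 for strongly agree (code 0); an unfavorable
    item earns 5 for strongly disagree (code 5).
    """
    total = 0
    for code, is_favorable in zip(responses, favorable):
        total += (5 - code) if is_favorable else code
    return total


def criterion_groups(totals: Dict[str, int], fraction: float = 0.27) -> Tuple[List[str], List[str]]:
    """Return (low group, high group) as lists of subject labels."""
    n = math.ceil(fraction * len(totals))
    ranked = sorted(totals, key=lambda subject: totals[subject])
    return ranked[:n], ranked[-n:]
```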


For each item, frequencies in each of the response categories for the high
and for the low group were tabulated. The 6 categories were then reduced
to 2 by combining categories 0, 1, 2, 3, and 4.¹⁰ From the resulting
2 x 2 tables, phi coefficients were calculated.¹¹ The phi coefficients
ranged in size from .16 to .78.

¹⁰ This grouping was necessary because our subjects gave predominantly favorable
responses to the items. If our universe of content [illegible]
unions, we would expect a more symmetrical distribution of [illegible]
a different grouping of categories.

¹¹ The nomographs by Guilford (8) or the tables prepared by Jurgensen (13) make
these calculations quite simple.

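For readers without the nomographs or tables, the phi coefficient of a 2 x 2 table can be computed directly from its four cell frequencies. The sketch below uses the standard formula; the function name and the cell labels are hypothetical, with rows taken as the high and low criterion groups and columns as the top response category versus the combined categories 0 through 4.

```python
import math


def phi_coefficient(a: int, b: int, c: int, d: int) -> float:
    """Phi for a 2 x 2 table.

    a = high group, top category      b = high group, other categories
    c = low group, top category       d = low group, other categories
    """
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        # A marginal total of zero leaves phi undefined; report 0.0 here.
        return 0.0
    return (a * d - b * c) / denominator
```
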



Next the 83 items were plotted in a bivariate distribution with phi 
values on the Y axis and scale values on the X axis.¹² The 4 items with
the highest phi coefficients were selected from each half-scale interval; 
due to the previously mentioned gaps in the scale continuum, this in- 
volved only the intervals from .5 to 2.5 and from 6.5 to 8.0. No items 
were selected from the “neutral” control items in the scale interval 4.0
to 4.9. The 28 items thus selected were assigned to Forms A and B of 
the questionnaires by alternating scale values between the two forms. 
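
The selection step just described can be sketched in Python. The sketch is a simplified illustration with hypothetical names: it takes up to 4 of the highest-phi items from every half-scale interval that contains items, whereas the authors drew only from the intervals .5 to 2.5 and 6.5 to 8.0, and it assigns items to the two forms by alternating along the ordered scale values.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

Item = Tuple[str, float, float]  # (label, scale value, phi)


def select_items(items: List[Item], per_interval: int = 4) -> List[Item]:
    """Pick the highest-phi items within each half-scale interval."""
    buckets: Dict[int, List[Item]] = defaultdict(list)
    for item in items:
        # floor(2 * scale value) indexes intervals of width one half.
        buckets[math.floor(item[1] * 2)].append(item)
    chosen: List[Item] = []
    for key in sorted(buckets):
        ranked = sorted(buckets[key], key=lambda it: it[2], reverse=True)
        chosen.extend(ranked[:per_interval])
    return chosen


def split_forms(chosen: List[Item]) -> Tuple[List[Item], List[Item]]:
    """Alternate items, ordered by scale value, between Forms A and B."""
    ordered = sorted(chosen, key=lambda it: it[1])
    return ordered[0::2], ordered[1::2]
```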

The final scales then consisted of 14 items each, with the items very 
closely equated as to Thurstone scale values, Q values, and phi values. 
For Forms A and B, respectively, the mean scale values of the 14 items 
were 3.85 and 3.91; the mean Q values were .90 and .92. Phi coefficients 
of the items in Form A ranged from .58 to .78 with a median value of
.65; for Form B they ranged from .58 to .76 with a median value of .66.
Only 1 of the remaining 10 “control” items had a phi value above .58. 
This was one of the 6 “neutral” items and it had a phi value of .61. The 
other “control” items would be rejected by the phi criterion. 


Reliability and Reproducibility of the Scale 


The reliability coefficient of the two forms of the scale, 14 items versus 
14 items, based upon the responses of 248 new subjects was .81, uncor- 
rected. For both forms of the test the range of scores was quite re- 
stricted, 30 to 70 in each case with possible ranges from 0 to 70. Within 
this restricted range, bunching at the upper, or favorable, end was present.
The mean score for Form A was 58.22 and the standard deviation was
7.33. For Form B the mean was 57.20 and the standard deviation 
was 7.79. 

Scale analysis based upon the performance of a sample of 87 subjects 
drawn from the larger group of 248 subjects was carried out with both 
forms of the test by the Cornell technique (11). A coefficient of reproducibility
of 87.5 per cent was obtained for Form A and a coefficient of
reproducibility of 87.2 per cent was obtained for Form B. Response
categories in each instance were dichotomized. Cutting points were
established and we observed Guttman's rule that “no category should
have more error than non-error” (11, p. 17). The range of modal response
categories was from .51 to .82 for Form A. The mean value of

¹² A plot of phi values against Q values indicated no discernible relationship, the
variability within columns being approximately the same as the total variability. This
would indicate that in the procedure followed here, the scale-discrimination procedure,
the phi analysis adds to the process of item selection when items with comparable Q
values are used. We have, it may be recalled, already eliminated the 50 per cent of the
items with the highest Q values. The relationship between the discriminatory power
of an item and Q value when this is not the case is described in another paper (4).




the modal categories, .57, which is the minimum value¹³ of the coefficient of
reproducibility for this set of items with the sample at hand, may be
compared with the observed coefficient of reproducibility of 87.5 per cent.
For Form B the range of the modal categories was from .52 to .67. The
mean value, which again is the lower limit of the coefficient of reproduci- 
bility, was .57, whereas the observed value of the coefficient of reproduci- 
bility was 87.2 per cent. 
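
The comparison between the observed coefficient of reproducibility and its lower limit can be illustrated numerically. The Python sketch below is a simplified stand-in and is not the Cornell technique: it assumes dichotomized responses coded 0 and 1, orders subjects by total score (ties kept in input order), places each item's cutting point where the fewest errors result, and does not enforce the rule that no category should have more error than non-error. All names are hypothetical.

```python
from typing import List, Sequence


def reproducibility(responses: Sequence[Sequence[int]]) -> float:
    """Coefficient of reproducibility for dichotomous (0/1) responses."""
    n_subjects = len(responses)
    n_items = len(responses[0])
    order = sorted(range(n_subjects), key=lambda s: sum(responses[s]))
    total_errors = 0
    for j in range(n_items):
        column: List[int] = [responses[s][j] for s in order]
        ones_total = sum(column)
        ones_below = 0
        best = n_subjects
        # Subjects ranked below the cut are predicted 0, the rest 1.
        for cut in range(n_subjects + 1):
            zeros_at_or_above = (n_subjects - cut) - (ones_total - ones_below)
            best = min(best, ones_below + zeros_at_or_above)
            if cut < n_subjects:
                ones_below += column[cut]
        total_errors += best
    return 1 - total_errors / (n_subjects * n_items)


def minimum_reproducibility(responses: Sequence[Sequence[int]]) -> float:
    """Mean proportion of responses in the modal category of each item."""
    n_subjects = len(responses)
    n_items = len(responses[0])
    modal = []
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_subjects
        modal.append(max(p, 1 - p))
    return sum(modal) / n_items
```

Because a cutting point at either end of the rank order predicts the modal category for everyone, the first value can never fall below the second.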

The two observed values of the coefficient of reproducibility are sufficiently
high to constitute evidence that but a single dominant variable
is involved in the sets of items or that, in other words, uni-dimensionality
is present. Such sets of items are said to be scalable or to constitute a
scale. The coefficients of reproducibility also mean that it is possible to 
reproduce item responses from rank order scores with the accuracy indi- 


property: the simple correlation between rank order scores and an external
criterion will be equal to the multiple correlation between the items


unambiguous and that it is possible to make meaningful statements about 
one subject being higher (more favorable) than another on the variable 
in question.¹⁵ This would not be true of a test involving more than one
variable. Suppose, for example, a test involves two variables. Then a
subject might obtain a given score by being high on one variable and low
on the other. Another subject might obtain the same score by being high
on the second variable and low on the first. From the rank order scores
alone it would be impossible to tell the relative positions of the subjects
on the two variables, and the interpretation of the composite score is
ambiguous. Statements of “higher and lower than” might be made, but
we would not know what the “higher and lower than” referred to,
for by increasing or decreasing the number of items related to either 


¹³ This is the lower limit because the reproducibility of any single item cannot be
less than the frequency in the modal category. The method of computing the minimum
value of the coefficient assumes independence of the items. See Guttman (12).

¹⁴ See footnote [illegible].

¹⁵ In the case of perfect scales, where the coefficient of reproducibility is unity, it
also follows that an individual with a low rank order score will not have given a more
favorable response to any item than any person with a higher rank order score.




variable, the rank order scores of the subjects could be altered.¹⁶ This
would not be true of a test in which the items all belong on a single
continuum, that is, a test which is uni-dimensional. In such a test,
increasing the number of items would not shift the rank order scores of the
subjects. 


Summary 


The method of scale construction described in this paper has been 
called the scale-discrimination method because it makes use of Thur- 
stone’s scaling procedure and retains Likert’s procedure for evaluating the 
discriminatory power of the individual items. Furthermore, the items 
selected by the scale-discrimination method have been shown, in the case 
described, to yield satisfactory coefficients of reproducibility and to meet 
the requirements of Guttman’s scale analysis. The scale-discrimination 
method is essentially a synthesis of the methods of item evaluation of 
Thurstone, Likert, and Guttman. It also possesses certain advantages 
which are not present in any of these methods considered separately. 

The scale-discrimination method, for example, eliminates the least 
discriminating items in a large sample, which Thurstone’s method alone 
fails to do. The unsolved problem in the Thurstone procedure is to 
select from within each scale interval the most discriminating items. 
Items within any one scale interval may show a high degree of variability 
with respect to a measure of discrimination. For example, we found 
within a single interval items with phi values ranging from .24 to .78. 
That Thurstone’s criterion of Q does not aid materially in the matter of 
selecting discriminating items is indicated by the plot of phi values against 
Q values, after the 50 per cent of the items with the highest Q values had 
already been rejected. Under this condition, items with Q values from 
1.00 to 1.09 had phi coefficients ranging from .32 to .76. Thurstone’s 
method also, by the inclusion of “neutral” items, tends to lower reliability 
and to decrease reproducibility of the set of items finally selected (6). 

Thus when selecting items by Thurstone’s technique alone, we have 
no basis for making a choice between items with comparable scale and 
Q values, and yet these items are not equally valuable in the measurement 
of attitude. By having available some measure of the discriminatory 
power of the items, the choice becomes objective as well as advantageous 
as far as the scale itself is concerned.¹⁷

¹⁶ We do not mean to imply by this discussion that multi-dimensional scales are
without value.

¹⁷ Additional research may indicate that the Thurstone scaling procedure is not
necessary. See, however, the articles by Edwards and Kilpatrick (6) and Clark and
Kreidt (2).




The advantage of the scale-discrimination method over the Guttman
procedure lies essentially in the fact that we have provided an objective
basis for the selection of a set of items which are then tested for scalability.
It may happen that not always will the scale-discrimination method
yield a set of items with a satisfactory coefficient of reproducibility. But
this is not an objection to the technique any more than the fact that not 
always will a set of intuitively selected items scale. Rather, it seems that 
the scale-discrimination method offers greater assurance of scalability 
than any intuitive technique such as applied by Guttman. Furthermore, 
the set of items selected by the scale-discrimination technique provides 
a wider range of content than do the intuitive Guttman items. In the 
scale-discrimination method, we obtain items which are not essentially 
multiple phrasings of the same question as is often true when the selection 
of a set of items to be tested for scalability is left to the experience of the 
investigator (7, p. 159). 

Several different areas of content are now being studied by variations
of the scale-discrimination method and the results of these studies
should provide additional evidence concerning the relationship between
the scale-discrimination method and scale analysis.


Received January 2, 1948. 


References 


1. Ballin, M., and Farnsworth, P. R. A graphic rating method for determining the
scale values of statements in measuring social attitudes. J. soc. Psychol., 1941, 13,
323-327.

2. Clark, K. E., and Kreidt, P. H. An application of Guttman’s new scaling techniques
to an attitude questionnaire. Unpublished paper, 1947. To be published
[remainder illegible].

3. Edwards, A. L., and Kenney, K. C. A comparison of the Thurstone and Likert
techniques of attitude scale construction. J. appl. Psychol., 1946, 30, 72-83.

4. Edwards, A. L. A critique of “neutral” items in attitude scales constructed by the
method of equal appearing intervals. Psychol. Rev., 1946, 53, 159-169.

5. Edwards, A. L., and Kilpatrick, F. P. [Title partly illegible] measuring social
attitudes. Amer. Psychol., [year, volume, and page illegible].

6. Edwards, A. L., and Kilpatrick, F. P. Scale analysis and the measurement of social
attitudes. Psychometrika, 1948, 13, June.

7. Festinger, L. The treatment of qualitative data by “scale analysis.” Psychol.
Bull., 1947, 44, 149-161.

8. Guilford, J. P. The phi coefficient and chi square as indices of item validity.
Psychometrika, 1941, 6, 11-19.

9. Guttman, L. A basis for scaling qualitative data. Amer. sociol. Rev., 1944, 9,
139-150.

10. Guttman, L. Questions and answers about scale analysis. Research Branch,
Information and Education Division, Army Service Forces, Report [illegible].

11. Guttman, L. The Cornell technique for scale and intensity analysis. Mimeographed,
1946.

12. Guttman, L. On Festinger's evaluation of scale analysis. Psychol. Bull., 1947,
44, 451-465.

13. Jurgensen, C. E. Table for determining phi coefficients. Psychometrika, 1947, 12,
17-29.

14. Likert, R. A technique for the measurement of attitudes. Arch. Psychol., N. Y.,
1932, No. 140.

15. Seashore, R. H., and Hevner, K. A time-saving device for the construction of
attitude scales. J. soc. Psychol., 1933, 4, 366-372.

16. Thurstone, L. L., and Chave, E. J. The measurement of attitude. Chicago: Univ.
Chicago Press, 1929.




A College Achiever and Non-Achiever Scale for the 
Minnesota Multiphasic Personality Inventory 


William D. Altus 
University of California, Santa Barbara College 


While working with illiterate soldiers as a personnel consultant during
the late war, the writer found that an adjustment test, devised for other
purposes, had considerable validity for predicting “academic” achievement,
if what these men of low intellectual caliber learned to do in terms
of literacy may be called “academic” (1). The adjustment of these
unlettered men, as determined by a 36-point, orally-administered test, was 
just as important as their intelligence, as determined by the Army 
Wechsler, in differentiating between those soldiers who would graduate 
and those who would fail and be sent home. This finding is noteworthy 
only because it is at variance with previous studies, of which Bell’s (2) 
may be considered as typical, where the correlations obtained between 
adjustment tests and grade averages did not deviate significantly from 
zero. The present study is one more attempt to find some significant 
relationships between the way college students respond to adjustment 
items and the type of grade average which they earn, intelligence being 
held constant.

The method of equated groups was used, the basis of the equating
being the standard scores earned on the Altus Measure of Verbal Aptitude.
This measure of aptitude was considered adequate for the purpose because
it had given a validity coefficient previously of .64 with elementary
psychology grades. The population from which the two groups were
drawn consisted of two classes in elementary psychology at the Santa 
Barbara College, University of California, during the spring semester of 


1947. The average standard score on the first two semester tests in
[illegible] academic achievement. [Illegible] they met the following
criteria: (1) if they [illegible] of each other on the Measure
of Verbal Aptitude; (2) if one of the pair had an average standard score
on the first two psychology tests at least one-half sigma above his standard
score on the Aptitude Test; (3) and if the other of the pair had a comparable
score at least one-half sigma below his measured aptitude.
Table 1 shows clearly that the two groups were
well equated on the basis of general aptitude.

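The pairing criteria can be sketched in Python. This is an illustration only: the closeness required on the aptitude measure is not legible in the source, so `tolerance` is a hypothetical parameter, and the greedy nearest-partner matching is an assumption rather than the writer's documented procedure. Scores are taken to be standard scores with a sigma of 10, so that one-half sigma is 5 points.

```python
from typing import List, Set, Tuple

Student = Tuple[str, float, float]  # (label, aptitude score, psychology test average)


def match_pairs(students: List[Student], tolerance: float, half_sigma: float = 5.0) -> List[Tuple[str, str]]:
    """Pair each over-achiever with an unused under-achiever of similar aptitude."""
    over = [s for s in students if s[2] - s[1] >= half_sigma]
    under = [s for s in students if s[1] - s[2] >= half_sigma]
    used: Set[str] = set()
    pairs: List[Tuple[str, str]] = []
    for achiever in sorted(over, key=lambda s: s[1]):
        candidates = [u for u in under
                      if u[0] not in used and abs(u[1] - achiever[1]) <= tolerance]
        if candidates:
            # Choose the under-achiever closest in aptitude.
            partner = min(candidates, key=lambda u: abs(u[1] - achiever[1]))
            used.add(partner[0])
            pairs.append((achiever[0], partner[0]))
    return pairs
```
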

Table 1
Certain Statistical Data on the Two Equated Groups

                     Achievers        Non-Achievers
Measure            Mean    Sigma     Mean    Sigma     C.R.
Aptitude          51.48*    6.85    51.52*    6.77      --
Psych. Grade†     60.30*    4.60    41.86*    5.38    12.99
H.P. Ratio**       1.99      .57     1.16      .32     6.38

* Standard scores, mean of 50, sigma of 10 for the total population.

† Psychology grade: Final average standard score in elementary psychology, based
on three semester tests and one final examination.

** Honor point ratio, in which A = 3, B = 2, C = 1, D = 0, and F = -1. Ratio
represents all college work taken.


It will be noted that the two groups are less variable in aptitude than the
population from which they came, i.e., the two classes in elementary
psychology, since the sigma is considerably smaller than ten. This dif- 
ference is simply due to the fact that it is impossible to vary too far below 
an already low aptitude in terms of grade achieved; similarly those quite 
high in aptitude could not meet the criterion of being one-half sigma 
above their tested aptitude. For these reasons it was impossible to in- 
clude the relatively quite dull or the very bright in the study. 

It will be noted from Table 1 that the two groups were almost two 
sigmas apart in average psychology grade. Although of the same general 
aptitude, one group earned an average grade of B in elementary psychology
and the other group fell at the dividing point between a C and a
D. The critical ratio of this difference in average psychology grade is 
12.99. The average grade earned in all subjects taken in college does 
not show quite the same divergence for the two groups, 1.99 and 1.16. 
One group, here called the achievers, was earning an approximate B 
average in college while the other, the non-achieving group, was earning 
only slightly better than the required C average. The critical ratio of the 
differences in average college grades is, however, 6.38, showing that a
statistically significant difference did exist.

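The critical ratios in Table 1 can be checked approximately from the tabled means and sigmas. The sketch below uses the common formula for the difference between two independent means with 25 cases in each group (the group sizes follow from the counts of men and women given in the text). The writer's exact formula is not stated, so this is an assumption; it yields values close to, but not identical with, the reported 12.99 and 6.38. The function name is hypothetical.

```python
import math


def critical_ratio(mean1: float, sigma1: float, n1: int,
                   mean2: float, sigma2: float, n2: int) -> float:
    """Difference between two means divided by its standard error."""
    standard_error = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


psych_cr = critical_ratio(60.30, 4.60, 25, 41.86, 5.38, 25)   # psychology grade
honor_cr = critical_ratio(1.99, 0.57, 25, 1.16, 0.32, 25)     # honor point ratio
```
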
In the present study the only factor held constant was general aptitude.
It would have been desirable, perhaps, to hold sex and age constant
but the population did not afford a sufficient pool of students for
any further matching. In the group working above capacity (the
“Achievers”) there were 22 men and 3 women; for those working below
their capacity (the “Non-Achievers”) there were 16 men and 9 women.
It is rather striking that three times as many women were in the non- 
achieving group as there were in the achieving group. It is probable that 




the competition engendered by the G. I. Bill and the relatively greater 
maturity of the male students, who average 4.5 years older, are responsible 
for the sex differences between the two small groups here studied. 

The 50 members of the two groups were administered the group form 
of the Minnesota Multiphasic Personality Inventory, hereafter called the 
MMPI. It was felt that the wide range of scales and individual items 
afforded by this test might reveal some significant differences between 
the two groups, these differences being in some way associated with non-intellective
factors of etiological significance in grade getting. The differences
in the average scores of the two groups on the various MMPI
scales are presented in Table 2. It will be seen that the four non- 


Table 2

Mean Differences Between the Achieving and Non-Achieving Groups
on the Various Scales of the MMPI

         Achievers   Non-Achievers            Achievers   Non-Achievers
Scale      Mean          Mean        Scale      Mean          Mean
?          50.00         50.00       Pd         58.00         58.80
L          51.84         53.64       Mf         56.56         53.84
K          59.16         59.08       Pa         53.48         55.40
F          52.60         53.56       Pt         56.24         57.56
Hs         52.76         53.20       Sc         58.32         60.92
D          51.20         54.32       Ma         54.28         61.18
Hy         58.04         59.40


clinical scales, ?, L, K and F, show mean scores that are quite alike.
Three of these non-clinical scales, ?, L and F, are very close to the normative
standard score of 50. The K scale, however, while it does not show
any difference between the two groups in mean scores, does show a rather
marked elevation in average score for both groups, some nine-tenths of a
standard deviation above the MMPI norms. The usual interpretation
of an elevated K scale is that it represents a defensive test-taking attitude,
either conscious or unconscious. Rather interestingly, however,
Meehl and Hathaway (4) have shown that this scale is to a certain degree
correlated with educational level, college students tending, on the average,
to make higher mean scores than is true of the entire population. Part
of the elevation in K for the two groups is doubtless due
to this educational factor. It may also be possible that motivational
factors operating in markedly over- and under-stimulated college students
(in terms of scholastic achievement) are associated in some degree
with defensive test-taking attitudes such as would [illegible]
elevated K score. Until norms for stratified samples of college students




are available on the MMPI, it will be fruitless to carry conjecture any 
further concerning the etiology of such anomalies as here appear on the 
K scale. 

The most noteworthy finding in Table 2 is the marked similarity be- 
tween the two groups in mean score on the various clinical scales. Some 
of the scores are somewhat elevated, as was the K score, but when the 
score for one group is elevated, it tends to be elevated to about the same 
degree for the other group. The trend for eight of the clinical scales is in
the expected direction, that is, for greater maladjustment on the part
of the non-achieving group. Only for the Mf scale is the direction
reversed; here the difference may be somewhat suspect because of the
disparity between the number of men and women composing the two
groups. Relatively high means for the two groups may be noted on at
least three of the clinical scales, Hy, Pd and Sc. The height of two of
these three scales, Sc and Pd, may be in part accounted for by the relatively
high scores on the K scales, since the size of K determines the size
of the correction to be applied to Sc and Pd; thus these two scales are in
part a function of K. The implicit cautiousness apparent in the heightened
K scale average is not reflected, though, in the otherwise rather high
Hy averages, 58.04 and 59.40, since K is not used as a correction on this
clinical scale. One explanation for the elevated means of the two groups
on the Hy scale is again the educational artifact noted in the preceding
paragraph in the discussion of the K scale: college students tend generally
to earn higher scores on the Hy scale than is true of the general population.
Whatever the cause or causes for the deviations of the two groups
from the normative score of 50, however, it will be noted that their
averages remain quite safely within the bounds of statistical normality.
And since neither the normality nor the abnormality, per se, of the two
groups here studied is the purpose of the present investigation, the
etiology of the mean score deviations will be discussed no further.

On one of the MMPI scales there is a significant difference between
the groups in mean score. On the Ma (Hypomania) scale there are 6.9
standard points difference in the means of the groups. The difference
converts into a critical ratio of 2.96, significant at the .01 level. None
of the other differences on the remaining eight clinical scales is significant
even at the .05 level. When it is remembered that, excepting for the Mf
scale, all of the differences are in favor of the greater maladjustment of
the non-achieving students, it may be assumed, despite the lack of statistical
certainty, that probably a true difference, even though slight, may
obtain between the two groups. Insofar as the Ma scale possesses
validity, the data would seem to suggest that the overactive, restless,
try-too-many-things type of person is a somewhat poorer student, on the average,
than is the better controlled, less active fellow student. Whether
there is any connection between this finding and that of Harris (3) who 


other, that one of the non-achievers who is in the top twelve per cent of 
local college students in verbal intelligence received an Army discharge 
for psychoneurosis. After five years of successful service as a non-com
in the Regular Army, he was cashiered from an OCS for striking an 


70 or higher on this scale; only one of the achieving group reached [illegible].
If these data are representative of college students as a whole, it may be
assumed that the chances of any individual student with an Ma score of
60 or higher working up to capacity in academic subjects are two [illegible]
against. The number of students in the present study is so small and the
possibility of a biased sampling so great that even so tentative an actuarial
assumption must be buttressed by further research before it can be
accepted.
A further test was taken in order to determine whether individual
items among the 567 in the MMPI could be isolated by item
analysis and fused together to form a special measure of the non-intellective
factors in grade-getting. A tabulation was made of the
answers to each of the items in the MMPI for the two groups. Since


one item was great enough to justify the [illegible]
scale. The 60 items which showed a difference of 5 or more


answers are shown in Table 3.

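The tabulation described above can be sketched in Python. The sketch assumes that each answer sheet is a list of True (“Yes”) and False values, one per item, and that an item is retained when the two groups differ by 5 or more “Yes” answers; the function name and the data layout are hypothetical.

```python
from typing import List, Sequence, Tuple


def discriminating_items(achievers: Sequence[Sequence[bool]],
                         non_achievers: Sequence[Sequence[bool]],
                         min_difference: int = 5) -> List[Tuple[int, int, int]]:
    """Return (item index, non-achiever "Yes" count, achiever "Yes" count)
    for every item whose counts differ by at least min_difference."""
    n_items = len(achievers[0])
    selected = []
    for i in range(n_items):
        yes_achievers = sum(1 for sheet in achievers if sheet[i])
        yes_non_achievers = sum(1 for sheet in non_achievers if sheet[i])
        if abs(yes_achievers - yes_non_achievers) >= min_difference:
            selected.append((i, yes_non_achievers, yes_achievers))
    return selected
```
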
It will be noted from Table 3 that the non-achievers answered “Yes”
more frequently to the first 42 items while the achievers answered “Yes”
more often to the last 18 items. The whole range of 60 items represents a
grabbag of symptoms which accord with no clearly defined syndrome.
However, the feminine cast of the non-achievers, perhaps as a


Table 3

Items in the Group MMPI Which Discriminated Between the Achieving and
Non-Achieving Groups by Five or More Points

“Yes” Answers: 18-11†, 10-4, 13-7, 18-10, 17-8, 10-5, 8-3, 10-5 [remaining entries illegible]

Item

I do not mind being made fun of.
I like collecting flowers or growing house plants.
When I was a child I belonged to a crowd or gang that tried to stick
together through thick and thin.
I like to flirt.
I have very few fears compared to my friends.
Sometimes I become so excited I find it hard to go to sleep.
Sometimes some unimportant thought will run through my mind and
bother me for days.
I wish I could get over worrying about things I have said that may have
injured other people's feelings.
I like to keep people guessing what I’m going to do next.
At times I have worn myself out by undertaking too much.
I dream frequently.
Usually I would prefer to work with women.
I can remember playing sick to get out of something.
While in trains, buses, etc., I often talk to strangers.
I have a daydream about life which I do not tell to other people.
I usually work things out for myself rather than get someone to show
me how.
I strongly defend my own opinions as a rule.
A large number of people are guilty of bad sexual conduct.
The one to whom I was most attached and whom I most admired as a
child was a woman. (Mother, sister, aunt or other woman.)
I work under a great deal of tension.
My sex life is satisfactory.
I have had very peculiar and strange experiences.
I am a good mixer.
I have never done anything dangerous for the thrill of it.
I like to cook.
I have the wanderlust and am never happy unless roaming or traveling
about.
It wouldn't make me nervous if any members of my family got into
trouble with the law.
I very much like hunting.
I never worry about my looks.
If I were an artist I would like to draw flowers.
Most people make friends because friends are likely to be useful to them.


* The items preceded by an asterisk showed discrimination value аз non-intellective 
items when administered to a new group and with a new criterion—average of all grades 2 
earned in college. 

f The number given first is the number of “Yes” answers marked by the Non- 
Achievers; the second number is the number of “Yes” answers for the Achievers. 


College Achiever and Non-Achiever Scale 391 
Table 3—Continued 


“Yes”
Answers   Item

19-11     32. I like Alice in Wonderland by Lewis Carroll.
23-18     33. I have no dread of going into a room by myself where other people have
              gathered and are talking.
23-16     34. I am not afraid of fire.
6-1       35. I often think I wish I were a child again.
8-3       36. I am apt to take disappointments so keenly that I can't put them out
              of my mind.
13-4      37. If given the chance, I would make a good leader of people.
24-16     38. I enjoy social gatherings just to be with people.
24-15     39. Except by a doctor's orders I never take drugs or sleeping powders.
10-3      40. I am fascinated by fire.
17-11     41. If I were in trouble with several friends who were equally to blame, I
              would rather take the whole blame than to give them away.
20-15     42. I have no fear of spiders.
9-14     *43. I am apt to pass up something I want to do when others feel that it isn't
              worth doing.
4-10      44. When I was a child, I didn't care to be a member of a crowd or gang.
13-18     45. I like to read newspaper editorials.
8-13      46. One or more members of my family is very nervous.
4-9       47. I am often so annoyed when someone tries to get ahead of me in a line
              of people that I speak to him about it.
5-10      48. I am not likely to speak to people until they speak to me.
          49. I sweat very easily, even on cool days.
10-15     50. I can read a long while without tiring my eyes.
          51. ... in petty thievery.
13-19    *    ... read about science.
4-9      *    I like to read ...
2-7      *57. ... I am more ... sit by myself or with just one other person ...
14-19    *58. ... and free from family rule.

the differential representation of the sexes in the two groups—is quite
noticeable: Those who work considerably under their capacity like to
cook (25), would like to collect flowers (2) or draw flowers (30), like to
work with women (12) and as a child loved some individual woman most
of all (19). Immaturity is also present: The non-achievers like to keep
people guessing what they're going to do next (9), often wish they were
a child again (35), have daydreams which they do not tell other people
(15). Relative fearlessness is also claimed by the non-achievers: They
are not afraid of spiders (42) or of fire (34), claim that they have fewer
fears than their friends (5). Self-assertiveness is another of their charac-
teristics: The non-achievers work things out for themselves, they say (16),
strongly defend their own opinions (17), feel that they would make good
leaders of people if given the chance (37). Manic trends are also ap-
parent: They admit to occasional excitement so great that sleep is im-
possible (6), to liking to travel so much that happiness is a concomitant
of traveling or roaming about (26), to having worn themselves out at
times by attempting too much (10). Femininity, immaturity, fearless-
ness, self-assertiveness and manic tendencies are, then, certain de-
scriptive adjectives which appear to characterize the answers¹ of the non-
achievers to differentiating items in the MMPI. These trends are, how-
ever, really minimal compared with the strength of the variable which,
for want of a better term, will be called social extroversion. The non-
achiever belonged to a crowd (3) or gang as a child, would take the whole
blame if his crowd got into trouble (41), likes to talk to people on trains
and buses (14), likes to be in social gatherings just to be with people (38),
is not disturbed by entering a room where people are talking (33), never
worries about his looks (29), though when he does worry it is usually
about a social matter, i.e., something he said that may have offended
others (8).

In an obverse way, the social variable also characterizes the achiever
who, academically, is working above his capacity: He sits by himself or
with just one other person when at a party (57), does not speak to people
until he is spoken to (48), is annoyed by those who get ahead of him in a
line of people (47), feels people would use unfair means to get ahead (52),
is so reserved he finds it difficult to defend his own rights (56). Less
social, more reserved, the achiever is also characterized, in his marking
the group MMPI, by opposite tendencies from those inferred in the
preceding paragraph for the non-achiever; that is, he is more mature in
his attitudes, less feminine, not so socially assertive, though he claims to
be free and independent of family rule, untroubled by manic tendencies
and admits to more fears than the non-achiever. One feels that the
differential between the two groups here considered fits fairly well the
stereotype of the introvert and extrovert, though not forgetting that the
60 items in Table 3 contain too many diverse characteristics to allocate
to a single continuum without making an unwieldy string of poorly
matched beads. With this word of caution in mind, it may be said that
the present data show the one who works markedly under his capacity in
college to be an immature, somewhat manic social extrovert while his
opposite number who works above his capacity is a rather aloof, well-
controlled introvert.

¹ This statement is worthy of being emphasized by a footnote: Note that the wording
is “characterize the answers,” not “characterize the personalities.” Operationally, the
behavior studied is the marking of items.

The discussion relative to the discriminating items of the MMPI has 
thus far been concerned with the characteristics of the two groups as 
inferred from the way they responded to the individual items. It is 
clear from the manner of their derivation that the items in Table 3 
should show quite divergent means for the two groups. The mean scores
of the two groups on the 60 items, scoring one point for each minus answer,
1 through 42, and one point for each plus answer, 43 through 60, are 24.6
for the non-achieving group and 39.6 for the achieving group. The question
remained, however, as to how efficacious the present scoring would be if 
applied to a new population. Consequently, the 60 items were tried out 
on a new group, consisting of 85 students in the elementary psychology 
classes. The first criterion used was the term grade earned by these 85 
students, predicated upon three semester tests and a final examination 
with a combined reliability for the four tests of .96. The criterion in 
this instance was both objective and highly reliable. 
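
The scoring rule described above may be sketched in Python for illustration. This is a minimal sketch and not the author's own procedure: the function name and the answer dictionary are hypothetical, and "minus" and "plus" answers are assumed here to mean "No" and "Yes" respectively.

```python
def achiever_score(answers):
    """Score the 60-item key.

    answers maps an item number (1-60) to True for "Yes" or False for "No".
    Assumption: a "minus" answer is "No" and a "plus" answer is "Yes".
    """
    score = 0
    for item, said_yes in answers.items():
        if 1 <= item <= 42 and not said_yes:
            score += 1  # items 1-42: one point for each "No"
        elif 43 <= item <= 60 and said_yes:
            score += 1  # items 43-60: one point for each "Yes"
    return score


# Two hypothetical answer sheets marking every item the same way.
all_no = {item: False for item in range(1, 61)}
all_yes = {item: True for item in range(1, 61)}
print(achiever_score(all_no))   # 42
print(achiever_score(all_yes))  # 18
```

Under this reading a higher score lies in the direction of the achieving group, which agrees with the reported means of 24.6 and 39.6.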

The same scoring was used for the 85 students on the 60 items as for
the experimental groups. The Pearson product-moment coefficient of correla-
tion with the criterion of psychology grades was .39. As a check of the
possible saturation of the scores thus derived with intellectual factors,
the MMPI items were correlated with standard scores derived from the
Altus Measure of Verbal Aptitude. Rather surprisingly to the investi-
gator, r proved to be .15, showing a slight positive relationship with in-
telligence which was presumed to be ruled out by means of the original
equated groups. Three possible reasons for the residual correlation of
.15 with intelligence are these: (1) The full range of aptitude was not
represented in the experimental groups, owing to the criteria used in
their selection; (2) it is probable that the manner of selection of the ex-
perimental groups caused a biased sampling, that is, selecting those
working significantly above or below their tested aptitude, so that the
non-intellective factors which characterize them are not perfectly charac-
teristic of other students of like aptitude; or (3) it may be that the cri-
terion of intelligence here employed, the Altus Measure of Verbal Aptitude,
incorrectly shows the two groups to be the same in aptitude when the
variation between their respective college grade averages (which is in
itself a fairly adequate criterion of intellectual capacity) was so great;
as a consequence, the loading of the presumptively “non-intellective”
MMPI items with intelligence will show up to an appreciable degree
when the items are applied to a new group, as happened in the present
case. Despite these restrictions, however, the bias is not too great since
the per cent of overlap between the MMPI items and grades achieved in
psychology is 15.21 (squaring the coefficient of .39) while the per cent of
overlap between MMPI and aptitude is only 2.13. Roughly 13 per cent
(15.21% less 2.13%) of the relationship between MMPI items and psy-
chology grades represents non-intellective factors, i.e., factors not ac-
counted for through the measure of aptitude employed.
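
The "per cent of overlap" used above is the squared correlation coefficient expressed as a percentage. A short illustrative sketch follows; the function name is hypothetical. Squaring the rounded coefficient of .15 gives 2.25 rather than the printed 2.13, which would correspond to an unrounded coefficient of about .146; that explanation is an inference and is not stated in the text.

```python
def percent_overlap(r):
    """Return the squared correlation coefficient as a percentage."""
    return 100 * r ** 2


grades_overlap = percent_overlap(0.39)    # MMPI items vs. psychology grades
aptitude_overlap = percent_overlap(0.15)  # MMPI items vs. verbal aptitude
print(round(grades_overlap, 2))    # 15.21
print(round(aptitude_overlap, 2))  # 2.25
print(round(grades_overlap - aptitude_overlap, 2))  # 12.96
```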

Feeling that the criterion of psychology grades was too parochial, even 
though completely objective and highly reliable, the writer next com- 
puted honor point ratios for each of the 85 students, in which all grades 
earned at the local college were averaged so that a Pearsonian coefficient 
could be computed. The manner of computation of the honor point ratio 
is given in a footnote to Table 1. The r between the 60 points from the
MMPI and the honor point ratio was .23, a coefficient which does not
quite meet the requirements for significance at the .01 level, though it
does at the .05 level. An r of this size, .23, is too close to the intercorrelation
of aptitude and the 60 points from the MMPI, .15, for the former to be 
of much use in a multiple correlation for predicting the present criterion, 
the average college grade. Consequently, the 60 MMPI items were 
analyzed by the upper and lower quartile method, the criterion of the 
analysis being, of course, the honor point ratio of the individual student. 
Twenty-six of the 60 items were found to be associated with college grade
average. In Table 3 these 26 items have an asterisk preceding the 
number of the item. The remaining 34 items from the original 60 were 
so close to zero in the quartile analysis that they were discarded in the
final scoring. Twenty-five of the 26 items which discriminated when the 
criterion was changed showed differences in the same direction as deter- 
mined by the experimental groups. The one item which changed direc- 
tion was item 43, “I am apt to pass up something I want to do when others 
feel that it isn't worth doing." This item seems logically to tap nearly 
the same attitude as does item 60, “My conduct is largely controlled by 
the customs of those about me,” but apparently it is a psychologically 
rather different question. The likenesses among the two types of scoring 
are so much greater than this shift of one item might imply that one must 
conclude that the non-intellectual factors entering into total grade average 
and into a single highly reliable course grade are closely akin. The 26 
items retain a sample of the social, infantile, feminine and manic ten-
dencies of the non-achieving student similar to those found in the mother
lode of the original 60 items. 
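
The upper and lower quartile method of item analysis mentioned above may be sketched as follows. This is an illustration under stated assumptions, not a reconstruction of the author's computations: the function name, the record layout and the toy data are hypothetical, and the text does not give the cutoff below which a difference was treated as "close to zero."

```python
def quartile_item_analysis(records, items):
    """Compare "Yes" rates in the top and bottom criterion quartiles.

    records: list of (criterion, answers) pairs, where answers maps an
             item label to True ("Yes") or False ("No").
    Returns, for each item, the proportion of "Yes" answers in the upper
    quartile minus the proportion in the lower quartile.
    """
    ranked = sorted(records, key=lambda record: record[0])
    k = len(ranked) // 4
    if k == 0:
        raise ValueError("need at least four records to form quartiles")
    lower, upper = ranked[:k], ranked[-k:]
    differences = {}
    for item in items:
        p_upper = sum(answers[item] for _, answers in upper) / k
        p_lower = sum(answers[item] for _, answers in lower) / k
        differences[item] = p_upper - p_lower
    return differences


# Toy data: eight hypothetical students with criterion values 1 through 8.
records = [(c, {"a": c > 4, "b": c % 2 == 0}) for c in range(1, 9)]
print(quartile_item_analysis(records, ["a", "b"]))  # {'a': 1.0, 'b': 0.0}
```

An item such as the hypothetical "a" would be kept, while "b", whose difference is zero, would be discarded.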


When the papers of the 85 students were re-scored on the basis of the
26 items thus derived, the following Pearson product-moment coefficients
of correlation resulted: with honor point ratio, .39; with elementary psy- 
chology term grades, .40; with the Measure of Verbal Aptitude, .21. The 
somewhat surprising aspect of the r's just given is not that the r with
honor point ratio increased from .23 to .39 (that would be expected
owing to the manner of selecting the 26 items) but that the new scoring
is slightly better for the original criterion, grades earned in psychology, 
than it had been when all 60 items were used. The difference is obviously 
not significant, .39 to .40; what is significant is that the original validating 
coefficient of .39 did not drop at all when a new criterion was used for 
selecting the items to be scored. The raw inference would be that non- 
intellective factors which are associated with the more inclusive honor 
point ratio are approximately the same as those which correlate with a 
single, highly reliable course grade. The reverse finding does not appear, 
however, to be true, that all non-intellective items associated with a 
single course grade are the same as for the less parochial average college 
grade. It is of parenthetical interest that more effective items might, 
perhaps, have been derived from the MMPI if the original basis for selec- 
ting the experimental groups had been the honor point ratio instead of an 
average standard score on two psychology semester tests. 

It will be noted that the saturation with aptitude in the two scorings, 
the 60-item test and the 26-item test, rose from .15 to .21. The latter 
coefficient is not quite significant at the .01 level. Although this r of
.21, aptitude versus the 26-item test, may appear to be relatively large,
it does not markedly reduce the validating coefficients of .39 and .40 for 
the two criteria since both coefficients remain above .30 when aptitude 
is held constant by partial correlation technique. The presence of any 
overlap whatever between what was supposed to be non-intellective items 
and intelligence does indicate, however, a fault in the technique em-
ployed. Better than using relatively small groups equated in aptitude
and differing in grade achievement would have been the survey of all 
students taking elementary psychology on the MMPI as well as on the 
test of Verbal Aptitude. If both tests had been administered to all
students, the well-known quartile technique of item analysis could have
been employed with both test variables, the criteria being the same as 
those here employed, grades in psychology and honor point ratio for all 
college grades. By the use of this technique, no items correlating posi- 
tively with aptitude would have been retained, thus assuring that the 
score derived from items so isolated would have a zero correlation with 
aptitude. In this manner questions could have been found which are 
truly non-intellective. And it is through valid non-intellective scales 
that a higher order of prediction for academic work will be made possible
since the tests thus derived will not overlap the functions measured by
the traditional intelligence test, which has proved validity for the pre-
diction of academic success.
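
The partial correlation technique referred to above can be illustrated with the first-order formula. The sketch below is hypothetical in one important respect: the correlation between aptitude and the criterion is not reported for this group of 85, so the value of .64 is borrowed from another study using the same aptitude test and serves only as an assumed figure.

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


R_SCALE_APTITUDE = 0.21      # 26-item score vs. aptitude (from the text)
R_CRITERION_APTITUDE = 0.64  # ASSUMED: not reported for this sample

# 26-item score vs. psychology term grades (.40) and honor point ratio (.39)
print(round(partial_r(0.40, R_SCALE_APTITUDE, R_CRITERION_APTITUDE), 2))  # 0.35
print(round(partial_r(0.39, R_SCALE_APTITUDE, R_CRITERION_APTITUDE), 2))  # 0.34
```

With the assumed figure both partial coefficients stay above .30, in line with the statement in the text.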


Summary


Two equated groups of elementary psychology students were given the 
group form of the MMPI. The basis of the equating was equality of 
intelligence test score and divergence in terms of psychology test scores of 
one-half sigma or more above or below the intelligence test score. In 
this manner two groups of 25 students each were obtained, one group 
being called the Achievers since it represented students working one- 
half sigma or more above their tested aptitude while the other, the non- 
Achievers, consisted of those working one-half sigma or more below 
their tested aptitude. The following findings may be summarized: 


1. The trend on eight of the nine clinical scales of the group MMPI 
was for slightly greater maladjustment on the part of the non-achiev- 
ing students. 

2. The only scale showing significance at the .01 level between the 
mean scores of the two groups was Hypomania. 

3. Subsequent item analysis of the MMPI in terms of final psychology 
grade revealed 60 items which showed a difference of five or more points 
between the two groups. A study of the 60 items indicated that the 
answers of the non-achieving group could be characterized as revealing 
greater femininity, immaturity, fearlessness, self-assertiveness and manic 
tendencies than the achieving group. The best single bi-polar concept 
characterizing the answers of the two groups seemed to be the traditional 
introversion-extroversion, when emphasis is placed upon its social aspects. 
The answers of the achievers revealed introversive tendencies; those of 
the non-achievers, a love of and a dependence on people, here called 
social extroversion. 

4. When the 60 “non-intellective” items were administered to a new 
group of 85 students, the 60 items correlated .39 with psychology term 
grades. The same score yielded an r of .23 with honor point ratios for 
total college grades of the 85 students. 

5. When the 60 non-intellective items were analyzed by the upper and
lower quartile method with honor point ratios as the criterion, 26 items
were retained. These 26 items correlated .39 with honor point ratio, .40
with psychology term grades, and .21 with the intelligence test used in the
study.
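
The rule for forming the two groups, stated at the head of this summary, may be sketched as follows. This is an illustrative reading only: the function names and toy scores are hypothetical, standard scores are computed here with the population standard deviation, and the further step of equating the groups on intelligence is not shown.

```python
from statistics import mean, pstdev


def standard_scores(values):
    """Convert raw scores to standard (z) scores using the population SD."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def classify(aptitude, achievement, half_sigma=0.5):
    """Label each student by the gap between achievement and aptitude z scores."""
    labels = []
    for z_apt, z_ach in zip(standard_scores(aptitude), standard_scores(achievement)):
        gap = z_ach - z_apt
        if gap >= half_sigma:
            labels.append("achiever")
        elif gap <= -half_sigma:
            labels.append("non-achiever")
        else:
            labels.append("neither")
    return labels


# Five hypothetical students.
aptitude = [1, 2, 3, 4, 5]
achievement = [2, 1, 3, 5, 4]
print(classify(aptitude, achievement))
```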


The data here reported, though based on a small number of cases, 
appear to justify the belief that if the correct method of selecting them 
is used, adjustment items can be found which will be associated with
academic achievement and have no relation whatever to intelligence as
it is currently measured. The usefulness of such a non-intellective scale 
in conjunction with a valid intelligence test in predicting academic 
achievement needs no elaboration. 


Received January 3, 1948. 


References 


1. Altus, W. D. The adjustment of Army illiterates. Psychol. Bull., 1945, 42, 461-476.

2. Bell, H. M. The theory and practice of personal counseling. Stanford University
Press, Stanford University, California, 1935.

3. Harris, R. E., and Christiansen, C. Prediction of response to brief psychotherapy.
J. of Psychol., 1946, 21, 269-284.

4. Meehl, P. E., and Hathaway, S. R. The K factor as a suppressor variable in the
Minnesota multiphasic personality inventory. J. appl. Psychol., 1946, 30, 525-
564.


College Grades and the Group Rorschach 


Grace M. Thompson 
University of California at Berkeley 


The lack of a perfect correlation between college grades and intel-
ligence, as it is currently measured, is probably due in part to imperfec-
tions in the measures of intelligence as well as to the unreliability of the
criterion employed. Yet to a greater degree it is undoubtedly an indica- 
tion that other factors besides sheer academic ability are of considerable 
importance in determining any single student’s academic success. The 
measurement of such personality factors, therefore, seems to be of para- 
mount importance to present-day education, whether in its guidance, 
grouping, or admissions programs. A measure of these non-intellectual 
factors is unfortunately not yet available in the armamentarium of the 
tester, who continues to rely solely on the easily obtained aptitude test 
scores. 

One of the promising techniques now being employed to measure the 
adjustment aspects of academic success is the Group Rorschach, which 
has already been used to advantage on a large scale by Ruth Munroe and 
her colleagues at Sarah Lawrence College, where it is now standard en-
trance procedure. In her monograph on this topic, Munroe (4) reports
the rather surprising finding that her Inspection Rorschach (an ab-
breviated check list to be used by examiners well acquainted with the
test) was associated with grades to a somewhat higher degree than the
ACE Psychological Examination, one of the most widely used aptitude
tests at the college level. Whereas the ACE was more successful in
predicting success, the Rorschach was more successful in predicting 
academic failure. 

Further indications of the validity of the Group Rorschach used at the 
college level in the differentiation of achieving and non-achieving college 
students, when intelligence was held constant, are to be found in a study 
by Montalto (3) at the University of Cincinnati. Such findings are 
certainly hopeful for the prognosis of large scale personality measurement. 
It would appear, however, that several aspects of the scoring and evalua- 
tion of the test must be simplified before such results could be achieved 
consistently: a fully standardized method of scoring, sufficiently objective 
so that ideally it would require less stringent training than is necessarily 
demanded at present of its scorers; and a thoroughly quantified inter-
pretation which would allow the assignment of a numerical value to each
protocol, thereby avoiding the present subjectivity of the test's usage, 
even in the hands of Munroe. 

The following study represents a limited attempt to investigate the
possibility of using the Group Rorschach in predicting academic success
by those factors inherent within the test which are associated with grades
but not related to intelligence as we are now able to measure it.

The Group Rorschach was administered to a beginning psychology 
class at Santa Barbara College of the University of California, using the 
standard slides projected on a screen and following Munroe’s method of 
administration. The class was composed of 128 students, who were en- 
rolled in a representative sampling of the college curricula. Sixty-three 
per cent were men; thirty-seven per cent were women. 


Table 1
Rorschach Items Investigated

 1. Total number of responses (R)
 2. Number of responses using the whole blot area (W)
 3. Number of responses using large detail areas (D)
 4. Number of responses using small detail areas (Dd)
 5. Per cent of total responses in which animals are the primary content (A%)
 6. Per cent of responses using the whole blot area (W%)
 7. Per cent of responses using large details (D%)
 8. Per cent of responses using small details (Dd%)
 9. Number of responses using the white spaces (S)
10. Total number of popular responses (P)
11. Per cent of responses on last three cards (8+9+10%)
12. Per cent of responses using animals and humans as primary content (A+H%)
13. Ratio between whole human and animal figures and human and animal details, as
    legs, eyes, etc. (A+H:Ad+Hd)
14. Ratio of whole human figures to human details (H:Hd)
15. Total responses on the achromatic cards (Ach R)
16. Total responses on the chromatic cards (Chr R)
17. Per cent of total responses on the achromatic cards (Ach%)
18. Total number of content categories (Cont. Cat.)
19. Anatomy and sex responses (An, Sex)
20. Number of human movement responses seen (M)
21. Human movement responses in small blot areas (M in Dd)
22. Number of responses with animals in motion (FM)
23. Ratio of human to animal movement responses (M:FM)
24. Number of vista or perspective responses
25. Number of responses describing movement of natural forces or generally inanimate
    objects (m)
26. Pure color responses, or those in which color is the primary determinant and form
    secondary (C, CF)




Table 1—Continued 


    Color sum: evaluating each pure color response as 1½, each color-form as 1, each
    form-color as ½ and summating values
    Ratio between M responses and color sum (M:C)
    Number of responses using shading as determinant (Y)
    Ratio of whole responses to human movement responses (W:M)
    Per cent of responses determined purely by form of blot (F%)
    Number of good form responses (F+)
    Per cent of good form responses within total form responses (F+%)
    Number of poor form responses (F−)
    Presence of popular M response in Card III (M in III)
    Presence of popular M response in Card II (M in II)
    Presence of popular response for Card V (P in V) as a whole
    Presence of popular response in lateral detail of Card VIII: animal, bear, etc.
    (P in VIII)
    Statement by subject that color used as determinant
    Organization total for whole series of cards (Z)
    Average organization value per response (Ave Z)
    Total organization on the achromatic cards (Ach Z)
    Total organization on the chromatic cards (Chr Z)
    Per cent of organization total on the achromatic cards (Ach Z%)
    Presence of any responses on the last three cards utilizing the whole blot with no
    other determinant than form (WF last 3)
    Refusal of any of the cards
    Presence of any of the following: color naming, blot description, marked persevera-
    tion of one response
49. Total poor form responses on all cards, including minus responses for all determi-
    nants: M−, FC−, etc.
50. Per cent of popular responses (P%)
51. Number of space and small detail areas combined (Dd+S)
52. Total number of shading and perspective responses (FY + Y)


Each test was scored according to the usual method for the scoring 
of an individual administration, following Beck (1), whose numbered 
delineations of areas, tables for plus and minus form values, and broad 
categories of symbols for determinants seemed to lend themselves best 
to the desired objectivity and reliability demanded by the group method. 
Each protocol was scored in addition for Klopfer's FM, or animal move- 
ment, and m, or the movement of natural forces and generally inanimate 
objects, which the research of Munroe and others had indicated to be 
significant items. A list of 52 Rorschach factors (see Table 1) generally
considered to be of interpretative significance was then used as a basis
for summarizing the protocols of the 128 students; it will be observed
that each factor was one which could be put down in a single purely
quantitative form.




In order to determine, first of all, which of the 52 Rorschach items 
were associated with grades, it was necessary to extract the top and bot- 
tom quartiles from the class distribution in semester grades. The single 
criterion of the final grade in the psychology 1A class was used, since this 
grade indicated a highly reliable criterion, being based on a series of objec-
tive tests whose combined reliability had been previously shown to be
.96. The two extreme quartiles were then compared on the findings
of each of the separate 52 Rorschach categories in order to discover those 
particular items which showed a direct relationship to the criterion. Any 
item which showed a difference greater than four points between the 
absolute number of cases in each of the extreme quartiles having the 
item in question was retained as being of possible differentiating value. 
For example, item 1, or total number of responses: 23 students in the
upper quartile had fewer than 30 responses, whereas only 15 in the lower
quartile had fewer than 30 responses. The difference between the quartiles
(23-15) exceeding four, the item was retained as of possible value.
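
The retention rule just described may be sketched as follows. The function name is hypothetical, and the use of the absolute difference is an assumption, made because Table 2 also lists one item favoring the lower quartile with a negative weighting.

```python
def retain_item(upper_count, lower_count, minimum_difference=4):
    """Keep an item when the quartile counts differ by more than the minimum."""
    return abs(upper_count - lower_count) > minimum_difference


# Item 1 from the text: 23 cases in the upper quartile, 15 in the lower.
print(retain_item(23, 15))  # True
# A hypothetical item with only a two-case difference.
print(retain_item(20, 18))  # False
```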

Thirty-four items met this minimum requirement; since the method 
of selection was admittedly only a rough and general one, each of the 34 
was given an equal weighting of one, and each separate Rorschach pro- 
tocol assigned a numerical score on the basis of how many of the 34 items 
the paper in question possessed. When these tentative numerical scores 
were correlated with the original criterion of grades, a Pearson product-
moment coefficient of .52 was obtained, indicating that there is a definite
relationship between certain factors in the Rorschach and the present
criterion.
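
The equal weighting and the correlation with grades may be illustrated in the following sketch. All names and data are hypothetical; the protocols are toy examples and not the study's records, and the single negatively weighted item of Table 2 is ignored for simplicity.

```python
import math


def unit_weight_score(protocol, retained_items):
    """Count how many of the retained items a protocol possesses."""
    return sum(1 for item in retained_items if protocol.get(item, False))


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Three hypothetical retained items and four hypothetical protocols.
retained = ["R<30", "W>4", "m present"]
protocols = [
    {"R<30": True, "W>4": True, "m present": True},
    {"R<30": True, "W>4": False, "m present": True},
    {"R<30": False, "W>4": True, "m present": False},
    {"R<30": False, "W>4": False, "m present": False},
]
grades = [90, 80, 75, 60]  # hypothetical term grades
scores = [unit_weight_score(p, retained) for p in protocols]
print(scores)  # [3, 2, 1, 0]
print(pearson_r(scores, grades))
```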

Since comparison of the two quartiles had indicated some of the 
Rorschach items to possess far more discriminating value than others, 
16 of the most valid items among the 34 were employed in a second cor- 
relation to see whether or not a smaller number of items would yield as 
adequate a validating coefficient. When this shortened scoring was cor- 
related with psychology grades, r was found to be .50. 

Since the original purpose of the study was to investigate chiefly the
personality or adjustment factors related to academic success, it seemed
likely that this coefficient was spuriously high, since many of the 34 items
may have been simply Rorschach variables reflecting the same sort of
intelligence that one of the usual aptitude tests could have measured
with a much smaller expenditure of time and effort. Accordingly, each
of the original 52 items on which the protocols were scored was again
item-analyzed, this time using the top and bottom quartiles of aptitude,
as measured by the Altus Measure of Verbal Aptitude, a short verbal test
which had been shown to give a correlation of .64 with the original cri-
terion of grades.




When the aforementioned 16-item Rorschach scoring was correlated 
with aptitude, it was found to give a coefficient of .43, high enough to 
insure, even without running a multiple correlation, that the addition
of the Rorschach scoring in that form would add little to the prediction
of college success beyond that already offered by the aptitude test.


Table 2
Rorschach Items Associated with Grades

Grade
Quartiles   Rorschach Item

23-15       R fewer than 30
25-19       W more than 4
24- 8*      D fewer than 18
 1- 3*      Dd 0 or 1
 6- 2       A% under 25%
18-10*      W% more than 24%
20-12       D% under 65%
22-17       Dd% under 15%
29-22*      S fewer than 4
 5- 2       P more than 8
21-14*      8+9+10% under 35%
 8- 4       A+H% between 41% and 54%
25-21       A+H:Ad+Hd equal to or more than 2(Ad+Hd)
19- 4*      H more than 2Hd
 8- 1       Chr R fewer than 10
21-13*      Ach% more than 44%
25-21       Cont. Cat. fewer than 13
28-22       M more than 2
25-20*      M more than FM
17- 8*      m present
20-12       FC more than C plus CF
30-23*      C sum less than 4.5
28-16*      C sum less than M
25-19*      W:M equal to or less than 2:1
31-26       F+% more than 69%
15- 8       C not stated as determinant
30-25       Z more than 19
 9- 4*      Ave. Z two or more
 7-12*      Ave. Z less than 1 (negative weighting)
29-20*      Ach Z more than 9
20- 8       Ach Z% more than 39%
20-11*      Total neg. R fewer than 3
31-26       P% more than 14%
26-19       Dd plus S fewer than 9
12- 6*      FY plus Y = 0 or 1

* Items retained for shortened scoring with grades.




Table 3 
Rorschach Items Associated with Aptitude 

Aptitude Quartiles 
 Q4-Q1      Rorschach Item 
            R fewer than 30 
 27-19      W more than 4 
 20-13      D fewer than 18 
 28-17      Dd fewer than 5 
 23-12      A% under 40% 
 20-11      W% more than 24% 
 19-14      D% under 65% 
 24-14      Dd% under 15% 
 27-20      S fewer than 4 
 16-12      P more than 6 
 29-21      A+H% between 40% and 74% 
 22-17      A+H more than 2(Ad+Hd) 
 21- 8      H more than 2Hd 
 15- 9      Chr R fewer than 15 
 22-14      Ach % more than 44% 
 20- 6      M more than 3 
 19-14      FM fewer than 3 
 26-13      M more than FM 
 17-12      FV present 
 20- 8      m present 
 19- 8      FC more than C plus CF (or both 0) 
 24-15      C sum less than M 
 22-13      W:M equal to or less than 2:1 
 16- 8      F% more than 60% 
 17-11      F+% more than 84% 
 30-22      M in III 
 20-15      M in II 
 32-24      Z more than 19 
 11- 1      Ave. Z more than 2 
  2-17*     Ave. Z less than 1 
 30-18      Ach Z more than 9 
 18-13      Ach Z% more than 39% 
  4-11*     Persev., color naming, blot desc. 
 19- 8      Total neg. R fewer than 3 
            P% more than 14% 
            Dd plus S fewer than 10 




When the best 35 were extracted from the original list of 52 and a quanti- 
tative scoring assigned to each student's paper by summating the number 
out of the 35 which his protocol possessed, a correlation of .51 was ob- 
tained between this scoring and the scores on the aptitude test. It was 
interesting to note that even within this homogeneous group of college 
students, the single item of M—or human movement responses seen in 
the cards—gave an r of .37 when correlated singly with aptitude. 

In order to avoid an overlap between the functions of the aptitude 
test and the Rorschach, it was necessary to retain in the final Rorschach 
scoring only those items which had, through a comparison of the two 
separate sets of item analyses, proved themselves to be positively associ- 
ated with grades, yet at the same time minimally or negatively associated 
with aptitude. It will be seen, however, from even a cursory examination 
of Tables 2 and 3 that such items were relatively rare; therefore, several 
were retained which showed a positive relation to aptitude, so long as 
their relation to grades was more marked. 
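
The selection rule just described can be sketched in a few lines. This is only an illustration, not the author's procedure: the rule "grade difference is positive and larger than the aptitude difference" is an assumed simplification of the stated criterion, the names `Item` and `is_non_intellective` are hypothetical, and the five rows are quartile counts copied from Tables 2 and 3.

```python
from typing import NamedTuple


class Item(NamedTuple):
    """Quartile counts for one Rorschach sign (values copied from Tables 2 and 3)."""
    name: str
    grade_hi: int  # top grade quartile showing the sign
    grade_lo: int  # bottom grade quartile showing the sign
    apt_hi: int    # top aptitude quartile showing the sign
    apt_lo: int    # bottom aptitude quartile showing the sign


ITEMS = [
    Item("D fewer than 18", 24, 8, 20, 13),
    Item("D% under 65%", 20, 12, 19, 14),
    Item("C sum less than M", 28, 16, 24, 15),
    Item("M more than FM", 25, 20, 26, 13),
    Item("m present", 17, 8, 20, 8),
]


def is_non_intellective(item: Item) -> bool:
    """Assumed rule: keep a sign when it separates the grade quartiles
    more sharply than it separates the aptitude quartiles."""
    grade_diff = item.grade_hi - item.grade_lo
    apt_diff = item.apt_hi - item.apt_lo
    return grade_diff > 0 and grade_diff > apt_diff


retained = [item.name for item in ITEMS if is_non_intellective(item)]
print(retained)
```

Under this assumed rule the first three signs are kept and the two movement signs are dropped, which agrees with their presence or absence in Table 4. The published list also keeps a few signs with no grade difference, so the rule is narrower than whatever judgment the author actually applied.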

Twenty items fulfilled the requirements, and were accordingly used 
as the basis of a final scoring of the Rorschach protocols for the non-in- 
tellective factors related to school achievement. The same procedure 
was followed as before, that is, each of the 20 items present was assigned 
a weighting of plus one, and the individual students’ records given a final 
numerical score according to how many of the 20 his particular test had 
possessed. A Pearson product-moment correlation of .38 was obtained 
between this scoring and the original criterion of term grades, whereas 
this same Rorschach scoring gave a coefficient of only .04 when correlated 
with the Altus Measure of Verbal Aptitude. The correlation between 
this Rorschach scoring and grades when aptitude was partialled out was 
.46. A multiple correlation, showing the combined influence of the ap- 
titude test and the Rorschach in predicting grades, was found to be .73, 
an appreciable rise from the original correlation of .64 between aptitude 
and grades. 
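
The partial and multiple correlations reported above follow from the three zero-order coefficients by the standard first-order formulas. The sketch below is a check on that arithmetic and not part of the original analysis; the function names `partial_r` and `multiple_r` are illustrative.

```python
import math


def partial_r(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def multiple_r(r_y1: float, r_y2: float, r_12: float) -> float:
    """Multiple correlation of criterion y with two predictors 1 and 2."""
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return math.sqrt(r_squared)


# Zero-order coefficients reported in the text.
r_grades_rorschach = 0.38
r_grades_aptitude = 0.64
r_rorschach_aptitude = 0.04

partial = partial_r(r_grades_rorschach, r_grades_aptitude, r_rorschach_aptitude)
multiple = multiple_r(r_grades_aptitude, r_grades_rorschach, r_rorschach_aptitude)
print(round(partial, 2), round(multiple, 2))
```

Both results round to the figures given in the text (.46 and .73). The partial coefficient exceeds the zero-order .38 because removing aptitude shrinks the grade variance left to be explained while leaving the Rorschach score almost untouched.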

A shortened scoring, using only 13 of the 20 non-intellective items, 
gave a slightly lower correlation of .34 with grades, and .07 with aptitude; 
it was therefore deemed advisable, at least for this particular group, to 
retain the original 20 point scoring. 

Several interesting clusters of Rorschach patterns seem to appear 
upon the examination of the 20 non-intellectual variables. The first of 
these is the tendency for the achieving students to concentrate their re- 
cords into fewer responses than the non-achievers. They tend to have 
fewer chromatic responses, fewer achromatic responses, and fewer content 
categories. They make less use of the large detail areas ordinarily ac- 
cepted as indicating a common-sense, matter-of-fact approach. These 
details are smaller, in relation to the total record, both absolutely and 
in terms of ratio. 


Table 4 
Non-Intellectual Rorschach Items 

 Grade       Aptitude 
 Quartiles   Quartiles 
 Q4-Q1       Q4-Q1       Rorschach Item 
 24- 8       20-13       D fewer than 18* 
 14- 3        9- 7       Dd 0 or 1* 
 20-12       19-14       D% under 65% 
  5- 2        2- 2       P more than 8* 
 21-14       19-17       8+9+10% under 35% 
  8- 4        5- 5       A+H% between 41% and 54%* 
 25-21       23-23       A+H equal to, or more than, 2(Ad+Hd)* 
 21-15       18-15       Achr R fewer than 13 
  8- 1        7- 4       Chr R fewer than 10* 
 25-21       16-20       Cont. Cat. fewer than 12* 
 20-20       18-20       An, Sex present 
 16-14       15-20       FV absent* 
 30-23       25-27       C sum less than 4.5* 
 28-16       24-15       C sum less than M 
 11-10        9-11       Y more than 3 
  5- 6        2-10       M not in III* 
 17-17       15-18       P not in VIII 
 15- 8       10- 7       C not stated as determinant* 
 20- 8       18-13       Ach Z% more than 39%* 
 12- 6       10- 8       FV+Y equal to 0 or 1* 

* Items retained in shortened scoring. 


Further, there appears to be a shying away from color. It is a better 
sign for grade-getting not even to mention color as a determinant. The 
proportion of responses on the last three colored cards is smaller for the 
achieving group, and there are more of them that have a color sum less 
than four and one-half. More also show an M:C (human movement to 
amount of color) ratio balanced on the side of human movement, in 
theory a more introverted pattern. 

Not only does there seem to be a shying away from color, but ap- 
parently a correspondingly greater interest of the achievers in the achro- 
matic cards. They use shading to a slightly greater degree (Y more 
than 3). They organize the material more adequately on the dark cards 
(Ach Z% more than 39%). Evidently there are more of them in each 
extreme of the use of shading, however, since it is also a favorable sign 
for both perspective and texture responses to be omitted altogether. 




There is a slight amount of evidence to suggest that the achievers 
are more conforming; at least they use more popular responses in general, 
although they react to a slightly lower degree to the two popular M 
responses in Cards II and III than might be expected. 

The remaining discriminating items do not seem to fit a recognizable 
pattern. The probability that the achievers might attend less to the 
insignificant details (Dd) is in the direction of what the conventional 
Rorschach interpretation might lead one to expect. Similarly the ratio 
of human and animal wholes to human and animal detail responses is not 
a surprising finding, since a healthy superiority of the former is generally 
considered advisable. Why the presence of anatomy and sex responses 
should have no apparent relationship to academic adjustment and only 
a slightly negative one to aptitude is somewhat more surprising, especially 
in view of the fact that Harrower-Erickson has listed their presence as 
one of the least desirable of the Rorschach items discriminating between 
psychoneurotics and normals. 


Summary 


The Group Rorschach test, when administered to a class of 128 college 
students, was found to be a valid predictor of the adjustment or motiva- 
tional aspects of grades, to the extent that a quantified scoring of the test 
papers gave a correlation of .38 with the criterion of semester psychology 
grades, and an r of only .04 with a measure of verbal aptitude. 

It is suggested, therefore, that the Group Rorschach may eventually 
prove useful as a large scale, practical, and objective tool for the measure- 
ment of those factors influencing grades which are not purely intelligence 
factors in the sense that they are capable of measurement by our standard 
aptitude tests. It will remain to be seen, of course, whether these same 
items—and undoubtedly, not all of them—will remain valid under other 
conditions and in other college groups. Cross-validation of the Ror- 
schach factors here isolated should be undertaken on further groups before 
any conclusive diagnostic weighting could be assigned to any of them. 
It would be expected that some loss in predictive value contributed by 
those particular items would occur in the process of cross-validation; and 
only actual practice could demonstrate to what extent the relationship 
described here between the Rorschach and academic achievement would 
be verified on repetition. Several comments, however, appear justified 
here: first, that the Group Rorschach can be quantified and still retain 
diagnostic value—a finding which would corroborate that of Munroe and 
others. The advantages of quantification and the group method of ad- 
ministration would appear to lie not only in the time of administration, 
one hour for a whole group, but also in the scoring, subjective elements 
being minimized by adherence to a predetermined set of categories which 
yielded an objective scoring that any other experimenter could apply 
equally well. Median scoring time in the present study was approxi- 
mately half an hour per record, and might be expected to vary depending 
on the number of Rorschach factors investigated, experience of scorer, etc. 
In the event that objective scoring and interpretation could prove prac- 
tical on a large scale, it would also seem probable that the present strict 
requirements for qualified Rorschach scorers could be lessened somewhat. 

Finally, then, it would appear that the Group Rorschach could be 
used in the prediction of academic success above and beyond the predic- 
tion offered by a standardized intelligence test, and it is to be hoped that 
further research will expand the practical use of the method. 


Received November 6, 1947. 


References 


1. Beck, S. J. Rorschach’s test: I. Basic processes. New York: Grune and Stratton. 
1944. 

2. Klopfer, B., and Kelley, D. The Rorschach technique. Yonkers-on-Hudson: World 
Book Company. 1942. 

3. Montalto, F. An application of the Group Rorschach Technique to the problem of 
achievement in college. J. clin. Psychol., 1946, 2, 254-260. 

4. Munroe, R. Adjustment and academic performance of college students. Appl. 
Psychol. Monogr., Stanford University Press, Stanford University, 1945. 


A Follow-up Study of Personal Counseling Versus 
Counseling by Letter * 


C. Harold Stone and Irving Simos 
University of Minnesota 


The widespread vocational counseling of returned veterans of World 
War II by the Veterans Administration, by other public and private 
agencies, and by universities and colleges has brought to the fore 
the question of methods of reporting to the counselee information con- 
cerning his aptitudes, abilities, interests, and possible areas for training. 
In numerous instances in the State of Minnesota, veterans who have 
taken advantage of the advisement service offered by the Veterans Ad- 
ministration have requested written reports of test results and counselors' 
recommendations. The apparent purpose of such requests has been to 
obtain a report which might be of benefit when applying for employment 
as well as a written record for future reference. Counseling letters sup- 
plementing the personal interview have been utilized by several VA 
Advisement Centers, and in one instance a follow-up study was made to 
determine what use the veteran made of the report (1). 

In view of this current interest in the use of written reports in coun- 
seling, results of a study conducted by the Employment Stabilization 
Research Institute in 1942 have been summarized for presentation with 
the thought that they may be of general interest. 

During the late fall of 1941 and early winter of 1942, a ten per cent 
sample, totalling 415 unemployed persons, was selected randomly from 
registrants filing for employment in the St. Paul office of the United States 
Employment Service. A comprehensive battery of aptitude, interest, 
ability, and personality tests was administered to the group, personal 
data and occupational history were obtained by interview and clearance 
with the Confidential Social Service Exchange, and reported occupational 
histories were verified through personal contacts with local employers 
and letter contacts with out-of-town employers. Careful analyses of 


* The study reported herein was conducted as a part of the studies of occupational 
competence of unemployed persons in St. Paul under the direction of the first author 
who was supervisor of studies of frictions in the labor market. Dale Yoder and Donald 
G. Paterson were co-directors of the Employment Stabilization Research Institute 
Study of Employment, Unemployment and Relief in St. Paul, 1939-1942. Results 
are published in Local Labor Market Research, University of Minnesota 
Press, 1948. 




case data were then made for each person to determine job possibilities in 
relation to existing and projected local employment opportunities. 

In reporting analyses of results to the individuals participating, the 
sample was divided randomly into two groups with persons in one group 
being counseled individually by the staff counselor¹ and those in the other 
group receiving written reports in the form of a “counseling letter.” 

During the conduct of the study, 214 cases were counseled personally and 
201 received counseling letters. The follow-up study reported herein 
includes 196 of the personally counseled cases and 184 cases who received 
counseling letters.² 

The counseling interview was normally about one hour in length, and 
after discussions of work history, previous training, test results, and other 
factors related to the occupational adjustment of the individual, specific 
plans of action were worked out by the counselee. More than one inter- 
view was required in many instances in order to aid the counselee to re- 
solve more adequately his problems of occupational adjustment. 

A standard outline was established as a framework for the counseling 
letter sent to those who did not receive the benefits of personal counseling. 
A summary of test results stated in general terms was included along 
with specific recommendations for suitable employment and training 
possibilities. A sample from the Institute files is shown on the following page 
to indicate the form and general nature of the letters. 

In order to discover whether reactions to counseling by personal in- 
terview differed significantly from those to counseling by letter, and 
further, to obtain indications of the effectiveness of both methods in 
aiding the unemployed in job seeking and in improving their morale and 
self-confidence, a follow-up study was conducted in July, 1942 (ap- 
proximately six months after the testing and counseling). 

The follow-up study was conducted by the mailed questionnaire 
method. Equivalent questionnaire forms were mailed to members of the 
counseled and letter groups. Following Toops’ method of using follow- 
up letters to obtain maximal returns (2, 3), three follow-up letters to the 
questionnaire were used to reach a percentage of returns deemed adequate. 
Table 1 shows the per cent of questionnaire returns received from the two 
groups. No significant sex differences were found in the percentage of 
returns in either group. Returns, however, from Counseled cases (85%) 

¹ Vivian J. Humphrey, now Senior Student Counselor, Student Counseling Bureau, 
University of Minnesota. 

² Eighteen of the Counseled group (16 full-time employed and 2 incomplete) and 
seventeen of the Letter group (16 full-time employed and 1 incomplete) are excluded. 
The inclusion of employed persons in the total sample resulted from the 
random method of selection of employment office registrants. 




Mr. J. 7. T, 
St. Paul, Minnesota 


Dear Mr. 7.: 


The following is a report of the results of the interviews and tests which you 
took at the Employment Research Center. We hope that this information will be 
helpful to you in seeking work or in preparing yourself for future employment 
by pointing out a number of job possibilities for which you seem to be fitted. 


Work For Which You Appear Qualified By Experience And Training 


On the basis of your work experience alone, your best immediate job opportunities 
would appear to be in work such as wrapper and packer or in the operation of some 
factory machines. You could also qualify as a painter's helper, plumber's helper, 
or possibly as a truck driver. 


Job Opportunities Open To You On The Basis Of Your Measured Capacities 


An analysis of the results of your tests indicates that you have excellent mechani- 
cal ability and superior ability to work at jobs requiring the rapid and accurate 
use of your fingers and small tools. Your clerical ability is only fair, and it 
is not advised that you seek training for, nor employment in office work. It is 
also recommended that you do not attempt work as a salesman. Your interests are 
similar to those of men who are successful skilled tradesmen, as for example, men 
who work as painters, carpenters, printers, and machinists. These results indicate 
that you should be successful as a semi-skilled worker in a factory or working at 
machines which would not involve a long training period. The suggestions made in 
the preceding paragraph are further indicated by the test results. 


Work For Which You Can Qualify If You Secure Additional Training 


It is strongly recommended that you secure additional training in some trade. You 
should investigate the possibility of taking courses at the St. Paul Vocational 
School. Since there is considerable demand for these courses, however, you may 
find that you can obtain training more quickly in a reliable private trade school 
such as Dunwoody Institute in Minneapolis. Your excellent mechanical ability 
indicates very strongly that you should secure training in some trade, such as 
machinist, mechanic, plumber, or in some other mechanical trade in which you may 
have a special interest. 


Use Of This Letter 
You may use this letter when applying for a job if you wish your prospective 
employer to know of our recommendations. If a prospective employer is interested 
in obtaining additional interpretations of the interviews and tests, we shall be 
glad to supply further information at his request. 


Very truly yours, 
(signed) 


Member of Research Staff 


Fig. 1. Sample of Counseling Letter. 


were significantly greater than those from Letter cases (74%). Total 
return of all questionnaires sent was 80 per cent. 
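
The comparison of return rates can be illustrated with a critical ratio for the difference between two proportions. The article does not state which standard-error formula was used, so the pooled-proportion version below is an assumption, and the result need not match the critical ratios printed in Table 1. The function name `critical_ratio` is illustrative; the counts are the Total columns of Table 1.

```python
import math


def critical_ratio(x1: int, n1: int, x2: int, n2: int) -> tuple[float, float]:
    """Difference between two proportions divided by its standard error
    (pooled-proportion form, assumed), with a two-sided normal probability."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


# Questionnaires returned / sent: Counseled 167 of 196, Letter 137 of 184.
z, p = critical_ratio(167, 196, 137, 184)
print(f"critical ratio = {z:.2f}, two-sided P = {p:.3f}")
```

With these counts the ratio comes out a little above 2.6 and the two-sided probability a little under .01, in line with the statement that the Counseled returns were significantly greater.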
Table 2 shows in percentages the responses to items in the question- 
naire considered relevant to this report. Inspection of the table reveals 
³ Questions included in the questionnaire and not summarized here are as follows: 
By what firm are you employed (if employed at present)? What is the title of your 




Table 1 

Returns of Follow-up Questionnaires Received from Unemployed Counseled by 
Personal Interview and Unemployed Counseled by Letter 

                          Male               Female               Total 
                    Sent   Returned     Sent   Returned     Sent   Returned 
                     N      N    %       N      N    %       N      N    % 
Counseled Cases     119    103   86      77     64   83     196    167   85 
Letter Cases        114     86   75      70     51   73     184    137   74 
Total Cases         233    189   81     147    115   78     380    304   80 
Critical Ratio 
(% Counseled vs. 
% Letter returns)         1.46               2.12                2.07 
P                         .144               .034                .008 


an unusual consistency of response between the Counseled cases and the 
Letter cases in the majority of instances. It had been expected that 
much wider differences would be found favoring the Counseled cases. 

Areas of close agreement⁴ between males counseled by personal inter- 
view and those counseled by letter may be summarized by question num- 
ber as follows: 


1. Employed since report of test results received; 2. Employed at 
time of follow-up; 3. Job satisfaction; 5. Report of test results helped 
decide type of job to seek; 6. Report of test results disclosed latent abili- 
ties; 7. Self confidence increased; and 9. Employment service applicants 
should have opportunity to take tests. 

Study of the actual responses to questions 3, 5, 6, 7, and 8 indicates 
quite clearly that a high percentage of both males and females placed a 
high value on the effectiveness for them of the testing and counseling 
program. 

The widest discrepancy between the two male groups appeared in 
Question 4. Two-thirds of the men counseled by personal interview felt 
that the discussions with the counselor helped them in subsequently 
talking to employers. Less than half of the men counseled by letter, 
however, found the written report helpful when talking to employers. 
The difference is statistically significant at the 1 per cent level. Similar 
[illegible] Exactly what do you do on your present job? How much do you earn per hour— 
[illegible]? Since receiving a report of your tests, to what firms have you applied and 
for what type of work? Have you taken any training course since receiving a report 
of the test results? Will you please give us any suggestions you may have for improving 
this type of service? 

⁴ Differences in percentages not statistically significant, P > .05. 




Table 2 

Comparison of Responses to Follow-up Questionnaire by Those Counseled by 
Personal Interview and by Those Counseled by Letter 

Note: Responses by sex and for the total group are shown in percentages, rounded 
off to yield 100% for each group. N's are shown in Table 1. 

Question 1. 
Counseled: Have you been employed since you discussed your test results with the 
job counselor? 
Letter: Have you been employed since you received a report of your test results? 

                       Male          Female          Total 
                     Yes   No      Yes   No       Yes   No 
                      %    %        %    %         %    % 
Counseled Cases       92    8       69   31        82   18 
Letter Cases        [illegible]     56   44        78   22 
Total Cases           92    8       63   37        80   20 

Question 2. 
Counseled and Letter: Are you employed at present? 

                       Male          Female          Total 
                     Yes   No      Yes   No       Yes   No 
                      %    %        %    %         %    % 
Counseled Cases       85   15       61   39        75   25 
Letter Cases          85   15       54   46        72   28 
Total Cases           85   15       58   42        74   26 

Question 3. 
Counseled and Letter: Do you like the type of work you are now doing? 

                       Male               Female              Total 
                     Yes   No    ?      Yes   No    ?      Yes   No    ? 
                      %    %     %       %    %     %       %    %     % 
Counseled Cases       70  [illegible]   [illegible]          73   24  [illegible] 
Letter Cases          76   22    2       81   19    0        73   21  [illegible] 
Total Cases           73  [illegible]   [illegible]          75   23  [illegible] 

Question 4. 
Counseled: Do you feel the opportunity you had to talk over your test results with 
the counselor helped you in talking to employers? 
Letter: Did you find the letter which was sent to you giving a written report of 
your test results helpful when talking to employers? 

                       Male               Female              Total 
                     Yes   No    ?      Yes   No    ?      Yes   No    ? 
                      %    %     %       %    %     %       %    %     % 
Counseled Cases       66   28    6       69   23    8      [illegible] 
Letter Cases          44   46   10       42   50    8      [illegible] 
Total Cases           57   35    8       59   33    8      [illegible] 




Table 2—Continued 

Question 5. 
Counseled: Did the discussion of the test results help you decide what kind of job 
to look for? 
Letter: Did the report of the test results help you decide what kind of job to look 
for? 

                       Male               Female              Total 
                     Yes   No    ?      Yes   No    ?      Yes   No    ? 
                      %    %     %       %    %     %       %    %     % 
Counseled Cases       67   30    3       68   32    0       68   31    1 
Letter Cases          63   37    0       64   36    0       63   37    0 
Total Cases           65   33    2       66   34    0       66   33    1 

Question 6. 
Counseled: Did an understanding of your test results disclose abilities which you 
did not know you had? 
Letter: Did the report of your test results disclose abilities which you did not know 
you had? 

                       Male               Female              Total 
                     Yes   No    ?      Yes   No    ?      Yes   No    ? 
                      %    %     %       %    %     %       %    %     % 
Counseled Cases      [illegible]         69   31    0      [illegible] 
Letter Cases          57   43    0       57   43    0       58  [illegible] 
Total Cases          [illegible]         64   36    0       60   39    1 

Question 7. 
Counseled: Did your discussion with the counselor result in increased confidence 
in yourself? 
Letter: Did the written report result in giving you increased confidence in yourself? 

                       Male          Female          Total 
                     Yes   No      Yes   No       Yes   No 
                      %    %        %    %         %    % 
Counseled Cases       83   17       82   18        82   18 
Letter Cases          70   30       77   23        73   27 
Total Cases           77   23       80   20        78   22 

Question 8. 
Counseled: In general, do you believe that taking the tests and discussing the 
results with the counselor were helpful to you? 
Letter: In general, do you believe that taking the tests and receiving the written 
[remainder of question illegible] 

                       Male               Female              Total 
                     Yes   No    ?      Yes   No    ?      Yes   No    ? 
                      %    %     %       %    %     %       %    %     % 
Counseled Cases       84   15  [illegible]  [illegible]      83   17  [illegible] 
Letter Cases         [illegible]         70   30  [illegible]  [illegible] 
Total Cases          [illegible]        [illegible]         [illegible] 


Table 2—Continued 

Question 9. 
Counseled and Letter: Do you think that applicants at the Employment Office 
should be given tests? 

                          Male                    Female                       Total 
                    To      To Those          To      To Those              To      To Those 
                    Every-  Who      To       Every-  Who      To           Every-  Who      To 
                    one     Ask      None     one     Ask      None   ?     one     Ask      None   ? 
                     %       %        %        %       %        %     %      %       %        %     % 
Counseled Cases      54      46                47      45       6     2      51      46       2   [illegible] 
Letter Cases         50      47       3        49      49       2            49      48       3 
Total Cases          52      46       2        48      47       5            52      45       3 


differences for the women were found. However, it must be noted that 
these two questions are somewhat different in wording and implications 
and a direct comparison of them may not be logically justified. 

Although the majority of both groups responded favorably to question 
8, concerning helpfulness of reports of test results, the difference in favor 
of those individuals counseled personally is statistically significant. 


Summary 


In summary, it seems clear that both groups were favorably impressed 
with the testing and counseling service. The high percentage of returns 
indicates that rapport had been well established, especially so for those 
personally counselled. The differences between the responses of the two 
groups are in general not statistically reliable, though what difference 
there is favors the personally counselled group. The use of counseling 
letters, however, is clearly shown to be an effective means of reporting 
results. 

Unfortunately, a third procedure was not utilized, namely personal 
counseling plus a summary letter to be retained by the counselee. Had 
this combined procedure been used, it seems reasonable to believe that a 
still higher percentage of favorable reactions to the counseling service 
might have resulted. 

Received January 15, 1948. 
References 
1. Iverson, Ralph, and Harris, John, “The Part VIII Summary Letter as a Counseling 
Technique,” Information Memorandum No. 1, Jan. 10, 1947, Voc. and Rehab. 
Divn., Veterans Admin, Regional Office No. 3035, Minneapolis, Minnesota 
(Mimeographed). 


2. Toops, H. A. The returns from follow-up letters to questionnaires. J. appl. Psy- 
chol., 1926, 10, 92-101. 


3. Toops, H. A. Validating the questionnaire method. J. Person. Res., 1928, 2, 
153-169. 


The Psychogalvanometric Method for Measuring the 
Effectiveness of Advertising * 


Gordon Eckstrand and A. R. Gilliland 
Northwestern University 


Advertisers have long been searching for objective techniques or 
methods of pre-testing advertising material that are inexpensive, fast, and 
reasonably valid: that is, techniques or methods of predicting, in ad- 
vance of use in an advertising campaign, the effectiveness of certain ad- 
vertising material as judged by a criterion of volume of sales induced. 

Whether an advertisement is a good one or not can only be determined, 
in the last analysis, by running the ad as scheduled and then observing 
the effect on sales exclusive of other factors. The buying public is, after 
all, the final judge. But this is an expensive method of operating con- 
sidering both time and money, since it does not permit the weeding out 
of poor ads before they are put before the public as part of an advertising 
campaign. In 1946 more than two billion dollars was spent for all kinds 
of advertising. With this great amount of money being spent, it is im- 
portant for advertisers to get as much as possible out of each advertising 
dollar. Thus the pre-testing of advertisements is of great economic in- 
terest as well as an interesting problem in the prediction of human 
behavior. 

In an attempt to get some idea of what to expect from an advertising 
appeal in advance of its actual use on the public, and in an effort to 
determine what factors go toward making good and poor ads, advertisers 
have developed several techniques for testing their material. Experts’ 
judgments, cross sections of public opinion, point rating systems, memory 
for ads, point-of-purchase sales tests, and split runs in media of limited 
circulation have all been used to test advertising material. However, 
some of these techniques have shown little validity, and others are time 
consuming and costly. Consequently the field of advertising is still 
looking for a valid and rapid method of measuring the effectiveness of 
advertising matter. 

It is the purpose of this research to investigate the usefulness of the 
psychogalvanic response as a measure for use in predicting the effective- 
ness of advertising material as measured by a sales test criterion. 

* The authors are indebted to Mr. G. Maxwell Ule, Director of Research, McCann- 
Erickson, Inc., Chicago, Ill., for permission to use the ads and appeals used in this study 
and for the sales test results used as a criterion in this study. 




For a good many years after its discovery as a psychological measuring 
tool in 1888, the psychogalvanic phenomenon enjoyed almost unbelievable 
popularity in psychological research. It has been studied with reference 
to everything from attitude (1) to the effect of cobra venom (7). How- 
ever, when we turn to a consideration of the psychological correlates of the 
psychogalvanic response we find little agreement among investigators. 
At various times and by various investigators, the psychogalvanic re- 
sponse has been claimed as a measure of emotion, conation, attitude, at- 
tention, level of consciousness, and many others (5). 

Landis and Hunt (6) have pointed out that the galvanic response is 
not “a measure of, regular criterion of, or indicator of, any one or a com- 
bination of these traditional psychological categories.” However, as 
both Landis and Darrow (3) have agreed, it seems to be a fairly certain 
method of demonstration of general autonomic activity. 

It seems fairly well established, then, that while many stimuli and 
stimulus situations may serve to elicit the psychogalvanic response, the 
response seems to be a good measure of the amount of general bodily 
arousal present at any time or during any portion of behavior. It seems 
equally well established that the psychogalvanic response is not a valid 
and reliable measure of any of the traditional psychological categories. 
This does not necessarily mean, however, that the psychogalvanic re- 
sponse will not be of value in predicting certain more complex types of 
responses. It may be that in a response as complex as a person's reaction 
to an advertisement, several or many of the psychological conditions 
mentioned above may be present and affecting behavior. It is this total 
response to the situation, this total amount of arousal in which we are 
interested. The psychogalvanometer seems well suited to measure this 
total arousal. 

There has been very little work done using the galvanometer to test 
advertising material. However, some evidence has accumulated to indi- 
cate that the changes in skin resistance of selected samples of subjects 
exposed to advertising material may be of value in predicting the later 
effectiveness of that material. Ruckmick (8) conducted a study in which 
the responses of the sweat glands were recorded during a three second 
exposure of advertising copy. Several series of copy, run with twenty 
subjects, revealed an internal consistency of data and also gave results 
which tallied in a general way with the choices obtained by the serial 
procedure of impression. 

Conrad (2)¹ conducted an investigation to determine whether it was
possible to study the responses made to advertising appeals of car cards 
by means of a psychogalvanic response apparatus. Using a Hathaway 


¹ This investigation was done under the direction of Dr. A. R. Gilliland.


Measuring Effectiveness of Advertising 417 


galvanometer, he exposed a series of car cards to a large group of subjects 
for five seconds each. The subjects used were college students and the 
cards were presented in a counterbalanced order. He later had the subjects
rank the ads as to their effectiveness in getting attention. He found
that the results obtained in this manner correlated only .18 with the re- 
sults obtained by the galvanometer. He did find, however, that definite 
galvanic responses could be obtained with advertising material as stimuli, 
and that certain material got larger responses than other material. 
Gilliland and Sharp (4) showed that the psychogalvanometer does 
record variations in the effect of advertising on readers. They did not
attempt, however, to correlate the size of the subjects’ galvanic reactions
with the effectiveness of the ads as determined by an outside criterion. 
They pointed out the need for using the psychogalvanometer to test ads 
that had already been evaluated as to selling effectiveness in order to 
establish the validity of the method.
In these earlier studies the technique has not been subjected to a
rigid experimental test where a suitable subject group was used and where
the method was validated against a suitable objective criterion. The
few studies reported here have used either no criterion of the ads’ effectiveness
or have used only the subjects’ opinion. This is, at best, only a
criterion of very limited value. The best, most direct, and most objective
criterion readily available is some measure of the ads’ actual selling effectiveness
in a realistic advertising situation. It is the purpose of this
research to test the hypothesis that effective advertising material, as 
judged by a sales criterion, will, on the average, induce larger psycho- 
galvanic responses in a selected sample of the population than will less 
effective advertising material. 


The Experiment 


Subjects. The material tested dealt with three popular, nationally
advertised food products made by the same company. An attempt was 
made to obtain a subject sample which would approximate a sample from 
the population to which the ads were directed. Since the material dealt 
with nationally advertised products, the sample used falls short on one
count immediately. The sample used had to be drawn from the area in
and around Evanston, Ill. Evanston and the surrounding area cannot
be considered a representative section of the country, but the sample
drawn from this area seems more representative of the country at large
than it does of the Evanston area.
Since the material dealt with in this study was concerned with basic
food products, the sample was made up of married women or single women
who cook and purchase groceries. A few women were included who were


418 Gordon Eckstrand and A. R. Gilliland 


engaged to be married and thus will soon be part of the potential buyers 
of these products. An attempt was made to get a distribution of subjects 
from the various income and age groups and a distribution of subjects 
with and without children. Due to the difficulty of obtaining subjects, 
no attempt was made to match local or national statistics on these factors. 
Table 1 presents the number of subjects falling in each of the categories. 


Table 1
Analysis of the Subject Group

Income          Number    Age        Number    Children       Number
Below $3,000      18      Below 24     15      No children      29
$3,000-$5,000     16      25-39        18      Children         19
Above $5,000      14      40-54        11
                          Above 55      4


Ads and Appeals Used. Three series of advertising material were 
tested. Two of the series consisted of advertising appeals made up into 
finished advertisements and the third series was composed of advertising 
appeals in verbal form not yet made up into ads. 

Series 1 consisted of three finished ads of pancake flour. Each ad 
was 11" by 814" and was done in black and white. With respect to all 
variables but basic appeal the ads were quite similar. They contained 
about equal amounts of pictorial illustration, headlines of approximately 
equal length, about the same amount of copy, and the brand name was 
used equally often. Series 2 consisted of two finished ads dealing with
a baby food. Each ad was 16” by 9" and was done in black and white. 
Again the ads were quite similar with respect to all variables but basic 
appeal. All the finished ads were mounted on stiff, white cardboard. 
Series 3 was made up of four advertising appeals or themes of a popular
brand of flour. These were basic themes or ideas which might later be 
used as a basis for the formulation of finished ads. Since the sales test 
to be used as a criterion was made with verbal presentation of the appeals, 
it was decided to record the appeals so that they could be presented to the 
subjects in a similar manner. The appeals were recorded by an an- 
nouncer with radio experience. The announcer was told to make each 
presentation as constant as possible. He was informed as to the nature 
of the experiment and told that we were interested in measuring the 
effectiveness of the basic theme or idea contained in the appeal and did 
not want effectiveness to vary as a function of the different qualities of 
his presentation. It is not possible to ascertain how well this purpose 
was accomplished, but of the several persons who have listened to the 
presentations, none have detected any bias in favor of any one appeal. 




Procedure. Two rooms were used in this investigation. One room 
was used for the presentation of the advertising material to the subject, 
and the adjoining room contained the equipment for recording the psy- 
chogalvanic responses and the equipment for playing the recorded mater- 
ial. One experimenter was in the room with the subjects and gave in- 
structions and presented the material. The other experimenter was in 
the adjoining room and controlled the recording apparatus. The experimenters
were in contact with each other by means of a two-way
signal system. 

The room in which the subject was seated was bare of distracting in- 
fluences as far as this was possible. The room was semi-sound-proofed, 
and although it did not keep out all sounds, it reduced the extraneous 
noises to a minimum. All daylight was excluded and the room was
lighted by electric lights so that the light on the ads would be constant. 
The ads were presented on a stand which was adjustable for height and 
distance and were presented at eye level. A blank piece of white card- 
board covered the first ad and a similar piece separated each of the fol- 
lowing ads so that the experimenter could control the rate of presentation. 

When a subject arrived she was brought into the room, and the electrodes
were fastened to her palm and arm. As most people have a
distinct aversion to being shocked by an electric current, this disturbing 
influence was removed as far as possible by telling the subject that there 
was absolutely no danger of being shocked. The subject was told to sit 
relaxed and that all that was required of her was to look at and listen to 
the material as it was presented. She was told to look at the ads as if 
she were seeing them in a newspaper or magazine and to listen to the 
appeals as if she were hearing them over the radio or someone was saying 
them to her. 

Within any series, the ads and appeals were presented in a counter- 
balanced order, and the presentation of the series themselves was also 
counterbalanced. This procedure controlled position effects and the
inter- and intra-series influences of one ad or appeal on another.
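The counterbalancing described above can be sketched in a few lines. The paper does not say how the orders were drawn up, so cycling through every permutation of a series is an assumption, as are the function name and the ad labels.

```python
from itertools import permutations


def counterbalanced_orders(items, n_subjects):
    """Give each subject one presentation order, cycling through all
    permutations so every order is used about equally often.
    (Assumed scheme; the paper does not specify one.)"""
    orders = list(permutations(items))
    return [orders[i % len(orders)] for i in range(n_subjects)]


# Hypothetical labels for the three pancake flour ads, 48 subjects.
orders = counterbalanced_orders(["A", "B", "C"], 48)

# How often each ad is shown first across the whole subject group.
first_position_counts = {
    ad: sum(1 for order in orders if order[0] == ad) for ad in "ABC"
}
print(first_position_counts)
```

With 48 subjects and six possible orders, each order is used eight times, so each ad leads the series equally often and position effects are spread evenly across ads.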

The subject was allowed to relax for a period of three to five minutes 
after the completion of the instructions in order for her to get used to 
the situation. This tended to make the galvanic readings more stable. 
At a signal from the experimenter running the apparatus, the other experimenter
removed the first blank card, thus exposing the first ad. In
order to accustom the subject to this procedure, the first printed adver- 
tisement and recorded appeal were always “dummies” during which time 
no readings were taken. This also tended to make the galvanic readings
more stable. The ads were presented for a 30 second period while the 
appeals lasted about 15 seconds. Between 30 and 45 seconds was allowed 




between the presentation of the ads and appeals within a series and between
45 and 60 seconds was allowed between each series. This interval
depended upon the stability of the psychogalvanic readings at the time.
The beginning and end of each exposure period was marked on the tape 
recording of the subject’s responses. 

Apparatus. The apparatus used in obtaining the galvanic readings 
was a two stage vacuum tube voltage amplifier with direct coupling. It 
was designed specifically for this type of research and this type of measure- 
ment. The apparatus has the advantage of ease of manipulation, ac- 
curacy in giving quantitative comparisons, and high sensitivity. An 
additional advantage was the obtaining of permanent records by graphi- 
cally recording the psychogalvanic responses by means of an Esterline 
Angus graphic recorder model A. W. 

Zinc electrodes about one inch in diameter were used. These were
attached by means of leather straps, and sponge rubber between the
electrode and the strap assured an even contact with the skin area.
One electrode was attached to the palmar surface of the hand and the 
other to the inner surface of the forearm. Commercial electrode paste 
and jelly were used to facilitate contact with the subject’s skin area. 

The graphic chart of the recorder moved with a speed of three inches 
per minute and the magnified changes in the subject circuit were recorded 
on the moving chart by means of a writing mechanism. The machine 
was calibrated with a decade resistance box so that the recorded responses 

could be read off as changes in subject resistance.

Criterion.² The criterion used in all three series of ads and appeals
was the results from sales tests conducted by the McCann-Erickson Ad- 
vertising Agency in Chicago. The purpose of these sales tests was in 
each case to analyze the relative sales effectiveness of the ads and appeals 
in question.

The tests, in each case, were made through a study of the movement
of store inventories associated with consumer exposure to the alternative 
advertising material studied. The studies were all conducted using 
stores situated in what were believed to be representative urban com- 
munities. In the consumer exposure to the various advertising materials, 
strict counterbalancing techniques were used. This tended to control the 
effect of random factors, biases from the cumulative impact of advertising 
exposure, and from the sequence of presentation of the various appeals. 

In all of the tests strict and rigid controls were used. Therefore, since
advertising was the major variable in the stores during the test, it is
reasonable to assume that the differences in sales revealed by the store
inventories were the result of advertising.


² A more complete description of the criterion tests cannot be given due to the
confidential nature of the techniques.




Of the many possible ways of evaluating the changes in resistance, 
only one was used in this study—the total log conductance change during 
the exposure to any ad. That is, the log conductance changes for each
ad were summed, giving a total arousal value. However, no change was
recorded unless there was at least 200 ohms of change, and no differences
between ads were recorded unless the change was 10% or greater.
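A minimal sketch of this scoring rule follows. It assumes that each response is read off the chart as a pair of resistances (before and after), that logarithms are to base 10, and that a fall in resistance counts as a positive change; the paper states none of these details, and the function name and readings are hypothetical. Only the 200-ohm threshold comes from the text.

```python
import math

MIN_CHANGE_OHMS = 200  # threshold stated in the text


def total_log_conductance_change(responses):
    """Sum log10 conductance changes over (before, after) resistance
    pairs given in ohms. Pair format and log base are assumptions."""
    total = 0.0
    for r_before, r_after in responses:
        if abs(r_before - r_after) < MIN_CHANGE_OHMS:
            continue  # change too small to be recorded
        # Conductance is 1/R, so
        # log10(1/r_after) - log10(1/r_before) = log10(r_before / r_after).
        total += math.log10(r_before / r_after)
    return total


# Hypothetical readings for one subject during one ad.
readings = [(50_000, 40_000), (40_000, 39_900), (45_000, 36_000)]
arousal = total_log_conductance_change(readings)
print(round(arousal, 4))
```

The second reading changes by only 100 ohms and is ignored; the other two each contribute log10(1.25) to the total arousal value.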


Results 


The problem of this study was the relationship between the total
arousal produced by the ad and its sales effectiveness. If two ads had
equal arousal they would produce equal log conductance changes or one 
would be greater in half of the cases and the other would be greater in the 
other half. Any variation from this one-to-one relationship could reasonably
be attributed to the greater efficiency of one appeal over the other.³
The significance of any deviation from this ratio can be checked by the 
Chi square method. Table 2 gives the number of times each ad in each 
of the three series produced the largest arousal value and the Chi square 
values for these differences. 
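The Chi square check against an even split can be sketched as follows. The counts are hypothetical, since the observed frequencies of Table 2 are not reproduced here, and the probability function assumes one degree of freedom (a comparison of two ads).

```python
import math


def chi_square_equal_split(observed):
    """Chi square of observed counts against an equal expected split."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


def p_value_df1(chi2):
    """Upper-tail probability of a chi square value with 1 degree of
    freedom. For df = 1, chi2 = z**2, so p = erfc(sqrt(chi2 / 2))."""
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical counts: ad X gave the larger arousal 24 times, ad Y 16 times.
chi2 = chi_square_equal_split([24, 16])
print(round(chi2, 2), round(p_value_df1(chi2), 3))

# Probabilities for the Chi square values quoted in the text.
print(round(p_value_df1(3.26), 3), round(p_value_df1(3.78), 3))
```

Under the one-degree-of-freedom assumption, the values 3.26 and 3.78 correspond to probabilities of roughly .07 and .05, in line with the "seven times out of 100" and "five times out of 100" given in the text.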

From Table 2 we can examine each of the three series of ads. In the 
pancake flour ads it is apparent that ad A gave more high arousals than 
ad B. The chi square value of 3.26 would occur by chance not more than
about seven times out of 100. The chi square of 3.78 between A and C 
would occur not more than about five times out of 100. The difference 
between B and C was insignificant. In the baby food series there were 
no significant differences between the two ads. There were likewise no
significant differences in the flour ad appeals. 

These same data for the arousal value of the three series of ads were 
analyzed by another method. The smallest log conductance obtained 
for each subject was arbitrarily given a value of 0 and the highest value 
obtained a value of 10; other values were distributed between these extremes.
Table 3 gives the means for each ad in each series by this
method.
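The rescaling can be illustrated with a short sketch. The text says only that the other values "were distributed between these extremes"; placing them by linear interpolation is an assumption, as are the function name and the sample values.

```python
def rescale_0_to_10(values):
    """Map one subject's log conductance values onto a 0-10 scale,
    with the smallest value at 0 and the largest at 10.
    Linear spacing in between is an assumption."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumed handling when a subject shows no spread at all.
        return [0.0 for _ in values]
    return [10 * (v - lo) / (hi - lo) for v in values]


# Hypothetical log conductance changes for one subject across three ads.
scores = rescale_0_to_10([0.25, 0.5, 0.75])
print(scores)
```

Scoring each subject on her own 0 to 10 range removes differences in overall reactivity between subjects, so that the means compare ads rather than people.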

The differences between these means were checked for significance by
the Fisher “t” test. Table 4 gives the “t” value for each comparison
for each of the three series of ads.
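A “t” value of this kind can be computed as sketched below. The paper does not say whether the two sets of scores were treated as paired; because every subject saw every ad in a series, the sketch assumes a paired comparison. The scores and the function name are hypothetical.

```python
import math
import statistics


def paired_t(scores_x, scores_y):
    """t statistic for the mean of paired differences (assumed design)."""
    diffs = [x - y for x, y in zip(scores_x, scores_y)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return mean_d / (sd_d / math.sqrt(n))


# Hypothetical 0-10 scores of four subjects for two ads.
ad_x = [6.0, 8.0, 10.0, 4.0]
ad_y = [5.0, 6.0, 7.0, 4.0]
t = paired_t(ad_x, ad_y)
print(round(t, 3))
```

If the authors instead treated the two sets of scores as independent samples, the standard error would be formed from the two separate variances and the resulting “t” would generally differ.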

The “t” score between ads A and B for the pancake flour ads was 
1.60. This means that if no difference in effectiveness existed between 

³ The authors are aware that other assumptions can be made about the distribution
of the expected frequencies and the treatment of the cases in which no differences were
found in galvanometric readings between the ads in a series. However, any method of
calculation would give similar results and the method here used seems as defensible
as any.




Table 2
[Caption partly illegible in the scan] in Effectiveness of Three Series of Ads
[Table body not recoverable from the scan. Column groups: Pancake Flour, Baby Foods, Flour. Columns within each group: Observed Frequency, Expected Frequency, Chi Square.]




Table 3
Mean Reactions for Each Ad

        Pancake Flour     Baby Food        Flour
Ad       N     Mean       N     Mean      N     Mean
A       46     5.50      48     3.73     48     1.31
B       46     4.03      48     4.52     48     1.49
C       46     3.98                      48     1.28
D                                        48     1.68


the ads a “t” as large as this and in the same direction would be obtained 
in only about seven times out of 100 by chance. The “t” value of .06 
between B and C was insignificant. 

Table 4
“t” Test for Significance of Difference

              Pancake Flour    Baby Food    Flour
Comparison          t              t          t
A-B                1.60           .90        .33
A-C                1.53                      .06
B-C                 .06                      .48
A-D                                          .58
B-D                                          .87
C-D                                          Л8


The difference between the baby food ads would occur about 17 times 
out of 100 by chance and was therefore on the borderline of probable 
significance. None of the flour ads showed statistically significant differences.

Both of the above types of analysis lead to the same general results. 
The results for the two methods can now be compared with the sales 
efficiency of the ads as a measure of the value of the galvanometric method
of testing ads. 

Criterion Data. In the sales test conducted with the pancake flour 
ads, it was found that ad A sold 2.1 times as many packages of flour as 
did either of the other two ads. There was little difference between ad 
B and ad C. Ad A sold 100 units of flour, ad B 47 units, and ad C
48 units. 

In the sales test on the baby food ads, no significant difference was 
found in the selling effectiveness of the two ads. Ad A sold 92 units and 
ad B sold 100 units. 

In the sales test conducted using the four advertising appeals or
themes, it was concluded that there was a significant difference in the
relative sales effectiveness of the four appeals tested. The differences




were small, however, and the advertising agency concluded that, for
practical advertising purposes, the actual degree of difference was so small 
that any of the appeals could be used with about equal effectiveness. 
Appeal A sold 96 units to the people hearing its sales talk, appeal B sold 
100 units, appeal C sold 83 units and appeal D sold 90 units. 


Summary 


Close agreement was found between the galvanic changes produced by 
a series of pancake ads and the sales effectiveness of these ads. The sales
effectiveness of ad A was 2.1 times as great as that of either ad B or ad C. Little
difference was found between B and C. Both the Chi square method
and the “t” test indicated that ad A was “better” (galvanic responses)
than the other two ads at the 7% level of significance. By the method
described here, no attempt was made to determine how much A exceeded
B and C. B and C were not significantly different in their galvanic responses.

The baby food ads had almost equal sales appeal. In their galvanic
responses there was no statistically significant difference. 

The results are more equivocal for the four flour ads, although the 
sales tests showed statistically significant differences. These differences, 
however, were small and the advertising agency stated that for practical 
purposes the four appeals could be considered equal. The differences 
between the galvanic responses to these appeals were not statistically 
significant. 

In conclusion, it may be stated that this study adds positive evidence
in behalf of the hypothesis that, under properly controlled conditions, the
effectiveness of advertising material can be predicted by the psychogalvanic
method.⁴ Further work is needed, of course, with different types
of advertising material and with material of different degrees of effective- 
ness. However, the technique gives promise as an objective evaluation 
of ads and advertising appeals. 


Received December 15, 1947. 


References 


1. Abel, T. M. Attitudes and the galvanic skin reflex. J. exper. Psychol., 1930, 13,
47-60. 

2. Conrad, W. E. E. The effect of advertising on psychogalvanic reactions. Unpublished
thesis. Northwestern University, 1929.


⁴ This statement is supported not only by the experimental results reported here
but also by a series of extensive but less carefully controlled studies briefly summarized
in a popular article by Walter P. Wesley, “Results of copy testing by ‘Arousal Method,’”
Advertising and Selling, Nov. 1947.




3. Darrow, C. W. The equation of the galvanic skin reflex curve. 1. The dynamics 
of reaction in relation to the excitation background. J. gen. Psychol., 1937, 16, 
285-309. 

4. Gilliland, A. R., and Sharp, L. H. Unpublished study.

5. Landis, C. Psychology and the psychogalvanic reflex. Psychol. Rev., 1930, 37, 
381-398. 

6. Landis, C., and Hunt, W. A. The conscious correlates of the galvanic skin response. 
J. exper. Psychol., 1935, 18, 505-529. 

7. Macht, D. I., and Macht, M. B. Effect of cobra venom and alkaloids on the psychogalvanic
reflex. Amer. J. Physiol., 1940, 129, 412.

8. Ruckmick, C. A. The electrodermal response to advertising copy. Psychol. Bull., 
1939, 36, 627. 


Book Reviews 


Adkins, Dorothy C., assisted by Primoff, Ernest S., and McAdoo, Harold 
L., of U. S. Civil Service Commission, and Bridges, Claude F., and
Forer, Bertram, of U. S. War Department. Construction and analysis

of achievement tests. U. S. Government Printing Office, 1947. Pp. 
292. $1.25. 


In their volume on testing for human skills and capacities important
to the public service Dr. Adkins and her colleagues have not only followed 
the canons of scholarship admirably but have also made the techniques of 
measurement clear for intelligent laymen and reasonably comprehensive 
for the specialist. Directed primarily to achievement as against aptitude 
testing, and to the prediction of job performance, it is the first volume of
its kind with chief emphasis on the development of tests by and for public 
personnel agencies. 

Unlike most "practical" books this text is not superficial. Difficult, 
complex topics and techniques are not dodged, if they are necessary to an
understanding of test development. Rather, they are faced squarely. 
They are, however, elaborated beyond the point necessary to the comprehension
of a trained specialist, as will be understandable. Technical
terms are defined and explained in the context where they first arise and 
also in a full, detailed glossary. 

Extensive tryout of the materials in training courses has demonstrated 
that persons new to the field of testing can learn, with the aid of this text, 
to apply the concepts and methodologies germane to testing in the public 
service. For this reason, the volume should be invaluable for federal 
committees and boards of examiners functioning for departments of 
government under the policy of decentralized examining. 

College teachers in the field of tests and measurements will find this 
book a valuable adjunct to their reference library or their list of collateral 
reading. Among others to whom it will be useful are college placement 
and testing services, college departments engaged in large-scale examining,
and industrial concerns with well established or prospective personnel
testing programs. The tabular and graphical materials, and
oftentimes the text itself, should prove a boon even to the sophisticated
technician and researcher. 

Although theoretical questions are strictly excluded, adequate discussion
plus the necessary modus operandi of calculation are given for
means, standard deviations, standard errors of differences, tetrachoric 



and point biserial correlation, and multiple regression. Thirty-four 
tables simplify the machinery of statistical calculation and serve as an 
excellent step-by-step process to orient and inform the newcomer to the 
field and to furnish handy tools for the seasoned researcher. Twenty-two 
figures supplement the tabular material. Twenty-four “exhibits” make
clear many practical applications of measurement and statistical meth- 
odology to problems of selecting personnel—trades journeymen, clerical 
workers, professionals. 

Dr. Adkins’ text should take a place among the signal and enduring 
contributions to the field. 


Fred S. Beers
State Technical Advisory Service, 
Social Security Administration, 
Washington, D. C. 


Crawford, Albert B., and Burnham, Paul S. Forecasting college achievement.
A survey of aptitude tests for higher education, Part I. General
considerations in the measurement of academic promise. New Haven: 
Yale University Press, 1946. Pp. 291. $3.75.


This book may be recommended to those interested in student per- 
sonnel work at the college level. It is concerned primarily with guidance 
of students into those fields of study in which they can be most successful. 
The framework of concepts and procedures basic to measurement and 
prediction of special abilities is presented in such a way as to be useful not 
only to technicians but also to administrators and faculty members in 
general. 

The book opens with an historical survey, elementary to the psychologist,
but instructive to those in other fields. It includes clear definition
of such concepts as aptitude, skill, and achievement, with examples to 
show how the tests operate. The difficulties inherent in aptitude testing 
are clearly presented, and theoretical methods of attacking the problems 
are suggested. 

Chapter two is a review of statistical principles involved in test work. 
It has value as an indication of the practical function of the statistical 
methods ordinarily taught in tests and measurement courses. The ma- 
terial quite naturally constitutes an argument in favor of advanced 
statistical courses as well.

Chapter three contrasts the so-called “general intelligence test” with 
tests which are intended to measure several more or less independent 
capacities. Several of the widely-used tests for adults and college students
are described, discussed, and criticized. One can agree with most
of the criticism directed against the few tests available for use at the 
college level. 




Chapter four is a review of achievement testing, indicating a trend
toward the measurement of higher and more complex thought processes. 
In comparing essay-type and objective examinations, attention is given 
to the means for eliminating or minimizing the alleged weaknesses of the 
latter. The discussion is centered around a few well-known, large-scale 
testing projects which have provided instruments for use with college 
students. Included are basic facts concerning the degree of success 
achieved by present methods for predicting college grades. 

Chapter five presents as a sample aptitude battery certain tests used in 
studies at Yale University. Methods, techniques, and results achieved 
in differential prediction of success in the liberal arts, in pure science, and 
in applied science, are discussed. The data are valuable, and the theo- 
retical implications are significant. 

Chapter six is a discussion of the theory of factorial analysis, and a 
presentation of some results secured by such factoring methods. Em- 
phasis is placed upon the Thurstone studies of Primary Mental Abilities. 
Crawford and Burnham indicate rather definitely that tests based upon 
factor analysis are of less practical value in guidance than are measures 
obtained by the older methods for development of aptitude tests. 

The last chapter is a review of test construction, with special em- 
phasis upon the measurement of idiosyncrasies. The procedures essen- 
tial to effective construction of such tests are described and explained. 
The discussion emphasized methods used by the College Entrance Ex- 
amination Board. An interesting detail is the fact that some of the 
methods developed for use with new-type tests have been applied to tests 
of the essay type. 

The appendices include some tables of statistical data, and some 
sample items suggesting the mental processes involved in the Yale battery 
of educational aptitude tests. 

The book furnishes clear insight into the activities characteristic of 
measurement in modern educational guidance. One need not accept all 
its conclusions; one can, for example, reconcile the views of the workers 
who still respect the IQ with the views of those who desire more analytic 
measures. The reviewer disagrees with the authors on several minor 
points, but finds the work as a whole characterized by sound judgment 
and good common sense. The book can be a very useful reference work 
for teachers of educational guidance, statistics, and tests and measure- 
ments. It is an important book for personnel workers interested in the 
selection and training of students in professional schools. 


Harold D. Carter 
University of California 




Braun, Carl F. Fair thought and speech. Alhambra, California: Privately
published, 1947. Pp. 50. $1.25 single copy. $1.00 twelve
or more. 


The keynote of “Fair Thought and Speech” is to be found in an in- 
troductory quotation from John Ruskin, “We require from men two kinds 
of goodness: first, the doing of their practical duty well; then that they 
be graceful and pleasing in doing it; which last is itself another form of 
duty.” The major part of the book is devoted to a discussion of the 
means whereby men can be graceful and pleasing in doing their practical 
duty well. 

The text of the book is a letter which the author, as president of a 
manufacturing firm wrote, as part of a series, to each of his employees. 
The principles which the book enunciates are designed to apply with equal 
force to everyone,—to leader and workman alike. A fair indication of 
the content and general tone of the material is to be found in the chapter 
headings. Representative of these are: “Don’t Act Superior,” “Don’t 
Question too Fiercely,” “Don’t Be too Positive,” “Don’t Be Stiff-necked,”
“Don’t Be a Worm,” “Don’t Be Unfair,” “Don’t Snap, Don’t
Scowl,” “Ego,” and “Concession.” As may be readily observed, the
author might better have used positive suggestion rather than negative 
suggestion in his approach. 

It is very evident that the author is making a sincere attempt to apply
Christian principles to business intercourse. In fact, he makes frequent 
use of Biblical quotations to support his thesis. Typical of these are 
“A soft answer turneth away wrath; but grievous words stir up anger— 
Proverbs 15:1"; "Sweet language will multiply friends; and a fair- 
speaking tongue will increase kind greetings—Ecclesiastes 6:5”; and 
“Can a man take fire in his bosom, and his clothes not be burned?— 
Proverbs 6:27.” 

The book is simply and clearly written with short sentences and para- 
graph headings to emphasize and drive home the author's message force- 
fully. It is evident that the author especially feels the need for tolerance 
in human relations as he makes an outstanding plea for it under the caption
“Looking Down,” as follows:


“Let’s not set ourselves above others. Let’s not think or talk
about people Below us or Under us. Let’s say, With us or Around us.
Let’s not spread information Down, but rather Out. Let’s not Hire
people, but rather Take them on or Have them join us. . . . Let’s not
talk of Superiors, but of Leaders. Let’s not speak of Telling people,
but rather of Asking. Let’s have no talk of, I am better than thou.”


Mr. Braun’s basic philosophy is clearly stated in his final chapter,
“By Little and Little,” as follows: “Little Drop: In human relations, perhaps
more than in other things, success or failure is made up of little
things. A friendly word a day will do the trick—will build success. An 
illiberal word a day, even one a week, will do the trick too, will dig a pit 
for any man. Illiberal words, missile words, condescending words, slip- 
pery words, sly words—let's drive them completely out of our thoughts 
and speech. ‘Weigh thy words in a balance, and make a door and bar
for thy mouth.— Ecclesiastes 19:1.’” 

The sincerity of the author’s purpose will be evident to every reader.
It is only to be hoped that Mr. Braun's employees and the others to whom 
the book is directed will accept his words in the same spirit in which he 
has written them. 


Robert N. McMurry
Robert N. McMurry & Co.,


Chicago, Illinois 


Churchman, C. W., Ackoff, R. L., and Wax, Murray. Measurement of
consumer interest. Philadelphia: University of Pennsylvania Press, 
1947. Pp. 214. $3.50.


A conference on the measurement of consumer interest was called by
a group of University of Pennsylvania philosophers with the coordination
of research as its objective. This book presents a record of the pro- 
ceedings. 

The section on problems in practice covered a variety of somewhat 
unrelated topics in a fairly informal way: exaggerated responses in polling 
(Crossley); preference and performance (Preston); the research client as 
a problem (Blankenship); the problem of getting people interested in 
becoming more efficient consumers (Doubman); the use of call-back 
interviews (Stock); and the researcher as a problem (Ellis). 

In contrast, the section on ways of evaluating preferences covered 
fewer problems but dealt with each one much more thoroughly and for- 
mally. Thurstone presented several theorems on the prediction of the 
frequency of first preferences when the scale values and discriminal dis- 
persions are known for each stimulus and developed a method of com- 
putation. Of special practical importance is his estimation formula which 
is restricted neither by the shapes of the affective distributions nor by 
their intercorrelations, for use in connection with the method of successive 
intervals. 

Irwin described several experiments to illustrate how preferences are 
affected by factors other than the physical characteristics of the objects. 
These illustrations should be studied carefully by anyone who attempts 
to measure preferences or to interpret reasons which respondents give 
for their preferences. 




Guttman gave a step by step description of the use of the Cornell 
technique for scalogram analysis. In addition to the technique of con- 
tent analysis he described two methods of intensity analysis: the “fold- 
over" and “two-part” techniques. The description is detailed enough 
so that any one could employ these techniques in his own field. Even 
people who refuse to accept the specific techniques will have to admit 
that a real contribution has been made by proposing a solution to such 
problems as biases in question wording and determining whether an 
attitude is scalable. 

The wide area which the conference attempted to cover is illustrated 
by the banquet address on the meaning of consumer interest. This was 
a discussion of the consumer movement. 

The section on the meaning of measurement became more philosophical. 
Singer struck the keynote of the conference by supporting 
cooperation rather than isolation. Deming distinguished between 
“qualitative” and “quantitative” surveys and set up criteria for a satisfactory 
statistical program; and Churchman discussed the consumer and his 
interests. 

Perhaps no section supported the theme of the conference, the need 
for cooperative research, more strongly than the discussion of specifi- 
cations for consumers’ goods. Wilks and Peach presented a stronger case 
with specific examples than could have been presented with many more 
words in the form of generalities. In addition, the topic was well handled 
by Breyer, Curtiss, and Palmer, as well as by Wilks and Peach. 

The discussion of sampling techniques produced the usual points and 
issues. This would be expected since the sampling problems in the meas- 
urement of consumer interests are about the same as the sampling prob- 
lems in any other consumer field. 

The section on the application of the measurement of attitudes con- 
sisted of illustrations from two fields. Viteles discussed the measurement 
of employee attitudes and went far beyond the mere listing of the results 
of attitude surveys. He also pointed out how they should be combined 
with the results of direct experimentation and with other types of infor- 
mation in reaching practical conclusions. Cartwright gave a nontech- 
nical description of the research program used to guide the War Loan 
Drives. 

Evidently the conference accomplished a number of things, some of 
them probably unintentionally. It established the need for greater 
cooperation in research. It demonstrated that the field of “the measurement 
of consumer interest” is loosely defined, by covering a range of unrelated 
topics, many of which would be considered irrelevant to the topic 
by most people. It highlighted the terrific gaps in our knowledge as far 
as this field is concerned. 




At any rate, the proceedings are well worth reading. They bring 
together in convenient form material from a number of fields and 
viewpoints, and thus they provide a many-sided sketch of this important field. 


Alfred C. Welch 
Knox Reeves Advertising, Inc., 
Minneapolis, Minnesota 


Moncrieff, R. W. The chemical senses. New York: John Wiley and 
Sons, 1946. Pp. vii + 424. $4.50. 


In comparison with vision and hearing, man’s chemical senses 
contribute little to his intellective activities. However, in terms of personal 
adjustment and social intercourse their place is by no means a lowly one. 
The use of perfumes is an old art. As far as the “stronger” sex is 
concerned, the satisfaction of the palate would rank well with sex in ma[...] 
The chemical senses are entangled in a variety of ways in the economic 
and political struggle. In the business world, brand loyalties created by 
using a specific shade of flavor represent tremendous economic assets. 
Such seemingly unassuming problems as packaging of food, changes in 
flavor with storage, and rancidity of fats are actually billion-dollar ques- 
tions. The irritant gases are of both industrial and military concern. 

In view of these facts it is somewhat surprising that standard texts of 
applied psychology barely touch on any of these topics. Thus 
Poffenberger’s Principles (1942) does not even include in the index such headings 
as taste or gustation, and olfaction is mentioned briefly in connection 
with the use of psychological testing techniques for medical diagnosis. 

Moncrieff's aim was to coordinate the data of physiological psychology 
and of chemistry bearing on the theory of chemical senses, and to present 
data which would be useful to individuals concerned with such problems 
as manufacture of perfumes and food production. Psychologists dealing 
with flavor and odor will find in Moncrieff's book a valuable preface, 
an “Einleitung” to this complex field. 

There is a glossary of over 300 terms, an extensive author index 
including not only the name and page but also the topic in connection with 
which an author is being cited, and an excellent subject index containing 
some 4000 entries. The strong point of the book is the chemical treatment 
of the subject. The text on psychology of the chemical senses, 
particularly on applied psychology, is yet to be written. 


Josef Brozek 
Laboratory of Physiological Hygiene, 
University of Minnesota 




New Books, Monographs, and Pamphlets 


Books, monographs, and pamphlets for listing and possible review should be sent to 
Donald G. Paterson, Editor, Department of Psychology, University 
of Minnesota, Minneapolis 14, Minnesota 


Counseling employees. Earl M. Bowler and Frances T. Dawson. New 
York: Prentice-Hall, Inc., 1948. Pp. 247. $4.00. 

Psychology and military proficiency. Charles W. Bray. Princeton: 
Princeton University Press, 1948. Pp. 242. $3.50. 

The magic of believing. Claude M. Bristol. New York: Prentice-Hall, 
Inc., 1948. Pp. 245. $2.95. 

Applied psychology. Harold E. Burtt. New York: Prentice-Hall, Inc., 
1948. Pp. 821. $7.35. 

Public opinion and propaganda. Leonard W. Doob. New York: Henry 
Holt and Co., 1948. Pp. 600. $4.00. 

A human relations casebook for executives and supervisors. Frances and 
Charles Drake. New York: McGraw-Hill Book Co., Inc., 1947. 
Pp. 187. $2.50. 

The labor force in the United States 1890-1960. John D. Durand. New 
York: Social Science Research Council, 1948. Pp. 302. $2.50. 

Emotional problems of living. O. Spurgeon English and G. H. J. Pearson. 
New York: W. W. Norton and Co., Inc., 1948. $5.00. 

Clerical salary administration. Leonard W. Ferguson, Editor. New 
York: Life Office Management Association, 1948. Pp. 220. $4.00. 

Sickness absenteeism among male and female industrial workers, 1937-1946, 
inclusive. W. M. Gafafer. Washington, D. C.: Superintendent of 
Documents, U. S. Government Printing Office, 1947. Pp. 4. $.05. 

Guide to occupational choice and training. Walter J. Greenleaf. 
Washington, D. C.: Federal Security Agency, Office of Education, 1947. 
Pp. 150. $.35. 

Shakespeare’s Hamlet. Ernest Jones. New York: Funk and Wagnalls 
Co., 1948. Pp. 180. $2.50. 

Principles of personnel testing. C. H. Lawshe, Jr. New York: McGraw-Hill 
Book Co., Inc., 1948. Pp. 227. $3.90. 

An introduction to clinical psychology. L. A. Pennington and I. A. Berg, 
Editors. New York: The Ronald Press Co., 1948. Pp. 595. $5.00. 

Psychology and life. Third edition. Floyd L. Ruch. New York: Scott, 
Foresman and Co., 1948. Pp. 782. $3.60. 

Evaluation of group guidance work in secondary schools. Georgia M. 
Sachs. Los Angeles: University of Southern California Press, 1948. 
Pp. 120. $2.50. 

The unfolding of artistic activity. Henry Schaefer-Simmern. Berkeley: 
University of California Press, 1948. Pp. 202. $5.00. 


Psychology for living. Herbert Sorenson and Marguerite Malm. New 
York: McGraw-Hill Book Co., Inc., 1948. Pp. 637. $3.00. 

Difficulty prediction of test items. Sherman Tinkelman. New York: 
Bureau of Publications, Teachers College, Columbia University, 1947. 
Pp. 55. $1.85. 

Social psychology. Wayland F. Vaughan. New York: The Odyssey 
Press, Inc., 1948. Pp. 956. $5.00. 

American Psychological Association 1948 directory. Helen M. Wolfle, 
Editor. Washington, D. C.: American Psychological Association, 
1948. Pp. 429. $3.00. 

Building self-confidence. C. Gilbert Wrenn. Stanford: Stanford 
University Press, 1947. Pp. 32. $.35. 

A 1948 survey of office salaries. American Business Report. Chicago: 
Dartnell Publications, Inc., 1948. Pp. 64. Report included with 
subscription to American Business at $5.00 for 15 issues. 

Developing public and industrial relations policy. General Management 
Series No. 140. New York: American Management Association, 
1947. Pp. 52. $1.00. 

Employee counseling services. Selected References No. 20. Princeton: 
Princeton University Industrial Relations Section, 1948. Pp. 4. $.10. 

Plan for action work kit, including report on employee opinion surveys. 
New York: Joint Committee Headquarters, Room 1750, 420 Lexing- 
ton Ave., 1947. $5.00. 

Labor market information (area series). Washington, D. C.: Superintendent 
of Documents, U. S. Government Printing Office, 1948. Issued 
monthly, $2.50 a year. 

Labor market information (industry series). Washington, D. C.: Superintendent 
of Documents, U. S. Government Printing Office, 1948. 
Issued monthly, $1.00 a year. 

Lighting schoolrooms. Washington, D. C.: Superintendent of Docu- 
ments, U. S. Government Printing Office, 1948. Pp. 17. $.10. 

Principle of equalization applied to the allocation of grants in aid. 
Washington, D. C.: Superintendent of Documents, U. S. Government 
Printing Office, 1948. Pp. 225. $.75. 


Journal of Applied Psychology 


Vol. 32, No. 5 October, 1948 


Opinions of Residents Toward an Industrial Nuisance * 


Kenneth E. Clark 
University of Minnesota 


When is an industrial establishment a value, and when a nuisance to 
the neighborhood in which it is located? This question is usually 
answered on an a priori basis by the city fathers who frame and enact zoning 
ordinances.¹ It may be answered at least as well, however, in terms of 
the reactions of residents who must live in the neighborhood of a particu- 
lar plant or area, and whose everyday existence is influenced by its pres- 
ence. The present paper reports an exploratory study of this question 
in one area of a large midwestern city, in which the criterion of value or 
nuisance of an industrial establishment is the expressed opinions of the 
residents living in the vicinity of the plant. 


The Problem 


The company whose role is under study is located in what is now an 
almost completely residential area in a large midwestern city. This 
residential area is bordered by a heavy industrial area to the north and 
east, but has grown and enhanced its own value because of its proximity 
to a large university. Many of the homes nearest the plant represent 
investments of from $15,000 to $35,000 in terms of 1948 values. The 
chief industrial “infection” in the midst of this residential area comes 
from one company, herein called Company X, and from a spur railroad 
track which runs through the residential area and services this company. 

Company X is an oil company which stores and distributes fuel oil, 
gasoline, and petroleum products. The company is not a newcomer to 
this area—its plant existed on a small scale before many of the better 
residences were built in the neighborhood, and pre-dates the present city 


* This study was made possible by support from the research funds of the Graduate 
School of the University of Minnesota. 
¹ While it is true that planning engineers are primary participants in determining 
appropriate land use, and they do attempt to take into account “neighborhood opinion,” 
nevertheless opinion polling techniques are rarely used, if ever, in establishing opinion 
gradients for zoning purposes. 




zoning ordinance which forbids the development of similar heavy 
industries in this area, or the growth and expansion of the present plant 
because it is a “non-conforming use.” During the past twenty-five years, 
and especially during the past five years, residents in the area have di- 
rected concerted efforts to prevent Company X from establishing more 
firmly its position in its present location, while Company X has fought 
with equal vigor to protect its present investment in facilities, and for 
permission to use more completely the space within its premises as well 
as to expand beyond its premises. 

The renewed attention of the neighborhood was directed to this plant 
and to the possibilities of danger from the stores of gasoline and fuel oil 
when during the winter months of 1948 a fire broke out in the garage at 
4:30 a.m. and threatened to spread to the rest of the plant. The oil and 
gasoline storage tanks are within 150 or 200 feet of the garage. Fortu- 
nately, the city fire department was able to confine the fire within the 
walls of the garage in which a dozen or more gasoline trucks burned with 
a number of minor explosions. It was this occasion which suggested the 
usefulness of a survey of opinion as a means of reflecting the varied feelings 
of residents in the neighborhood of Company X. Within a month after 
this fire occurred, the survey of opinions reported in this article was 
conducted. 

Procedure 


A sample of residents in the neighborhood was selected by interview- 
ing one person from every third household as listed in the City Directory. 
No household existed in the neighborhood which was not listed in this 
directory. The area included in the survey was limited to that segment 
which was presumably most affected by the presence of the plant. The 
geographic area so included was large enough so that some of the house- 
holds were located as much as two-thirds of a mile from the plant. 

Interviewing was done by student interviewers being trained in a 
course in the principles and techniques of public opinion analysis. Re- 
spondents to be interviewed were limited to responsible adults of ages 
twenty-one or over actually living in the households being sampled. 
Callbacks were made until interviews were completed or refused, or until 
it was ascertained that the entire family was out of town, or until four 
calls had failed to yield results. Usable interviews were thus completed 
for 144 out of 152 households listed in the sample. 
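
The sampling rule described above can be sketched in a few lines of Python. This is an illustration only: the directory listing, its length of 456 entries, and the function names are hypothetical, chosen so that taking every third entry yields the 152 households mentioned in the text. The actual directory is not reproduced in this report.

```python
def systematic_sample(listing, step=3, start=0):
    """Return every `step`-th entry of an ordered listing, beginning at `start`."""
    return listing[start::step]


def completion_rate(completed, drawn):
    """Proportion of sampled households that yielded a usable interview."""
    return completed / drawn


# Hypothetical stand-in for the City Directory: 456 household entries in listed order.
directory = [f"household-{i:03d}" for i in range(1, 457)]

sample = systematic_sample(directory)   # one household from every three listed
rate = completion_rate(144, 152)        # counts reported in the text

print(len(sample), round(rate, 3))
```

On the counts given in the text, 144 usable interviews from 152 sampled households is a completion rate of a little under 95 per cent.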

The questionnaire which was used consisted of two parts. The first 
part contained six questions relating to opinions of the effect of the plant 
on property values, and on issues of plant building and expansion. The 
second part of the questionnaire related to attitudes toward the plant as a 
hazard constituting a threat to the residents' safety. The panel of residents 
was also divided into two groups: a "near" group, made up of those residents 
who live closest to, and are most affected by, the presence of the plant, 
and a "far" group, made up of residents living at a greater distance from 
the plant. 


Results 


Presented below are the responses obtained to each of the questions. 
The first six questions are presented as a group. The first five deal 
primarily with plant building and expansion, an issue which is an im- 
portant one involving several battles in the city planning commission, 
the city council, and in the courts in recent years. The sixth question 
taps opinions as to the effect of the plant on residential property values. 


                                                       %      %     % No 
                                                      Yes     No   Opinion 

1. “Do you think Company X should       Total         27     54     19 
   be permitted to build a second       Near Group    19     60     21 
   story on its office building?”       Far Group     34     50     16 

2. “. . . to build a warehouse next to  Total         17     76      7 
   their storage tanks?”                Near Group    20     76      4 
                                        Far Group     19  [illegible] 12 

3. “. . . to increase the number of     Total          9     83      8 
   oil and gasoline storage tanks?”     Near Group    12     84      4 
                                        Far Group      7     83     10 

4. “. . . to expand their plant across  Total          6     87      7 
   the railroad tracks?”                Near Group     9     87      4 
                                        Far Group      4     88      8 

5. “. . . to rebuild the garage which   Total         63     27     10 
   was damaged in the recent fire?”     Near Group    61     30      9 
                                        Far Group     64     24     12 

6. “Do you believe that property        Total         58     27     15 
   values in your immediate             Near Group    59     27     14 
   neighborhood are lowered by the      Far Group     60     26     14 
   presence of Company X?” 


Even though the six questions above deal with the general problem of 
plant expansion and property values, responses to each are highly specific, 
and indicate no general “halo” of totally favorable or totally unfavorable 
attitudes. Thus, while but six per cent of the entire group favor expansion 
of the plant across the railroad tracks, a large majority (63 per cent) 
are willing to have the company repair the damage caused by the recent 
fire. Equally of interest is the fact that residents near the plant express 
opinions on the six issues presented which are almost identical with those 
of residents living at a greater distance. 

The following questions relate more directly to the attitudes of 
residents to the plant and its influence in the neighborhood, as a hazard 
which constitutes a threat to their safety. These responses are likewise 
presented not only for the total group, but for near and far residents as 
well. 


                                                       %      %     % No 
                                                      Yes     No   Opinion 

7. “Do you believe there is danger      Total         53     38      8 
   to your property from the            Near Group    67     28      5 
   Company X plant?”                    Far Group     40     49     11 

8. “How strongly do you feel about      Answered Question 7: 
   this?”                                             Yes (53%)   No (39%) 
                                        Very strongly       55%      14% 
                                        Rather strongly     33       25 
                                        Not strongly at all 12       44 
                                        No opinion           0       17 
                                                           100%     100% 

                                                       %      %     % No 
                                                      Yes     No   Opinion 

9. “Were you at home and awake          Total         37     52     11 
   during the recent fire in the        Near Group    52     43      5 
   garage at this plant?”               Far Group  [illegible] [illegible] 14 

10. “Did you feel any fear or anxiety   Total         32     66      2 
    about the possibility of fire         (N = 53) 
    spreading to your home?” (Asked     Near Group    47     53      0 
    only of those answering Question      (N = 34) 
    9 “Yes”).                           Far Group      5     90      5 
                                          (N = 19) 


The responses to these questions relating to the hazards associated 
with a plant of this kind certainly suggest that the relative distance be- 
tween the household and the plant is one of the chief factors which deter- 
mine attitudes toward the plant as a fire hazard. This is particularly 
true when one examines responses to the questions directly related to the 
fire for the near and far residence groups. Further evidence to support 
this view is obtained when respondents are grouped according to the 
streets on which they live. On almost every street in the neighborhood, 
the total pattern of responses of residents near the plant was less favorable 
to Company X than for the residents living on the more distant part of 
the street. This “opinion gradient” is observed when a single total score 
(based on all questions) indicating favorable or unfavorable attitudes 
toward the plant is used. However, the responses to individual questions 
presented heretofore suggest that this gradient results primarily from 
differences in feelings regarding danger from the plant rather than from 
differences in feelings about plant expansion and property values. 

Two different dimensions of opinion seem, therefore, to exist. The 
first dimension deals with the influence of this “nuisance” plant on the 
residents in the community, with primary emphasis on the effects it has 
on property values, and the desirability of living in such a neighborhood. 
On this dimension, little difference is observed between the near and far 
residents. That there should exist such an absence of differences of 
opinions between groups living in the shadow of a large oil storage plant 
and groups living at a distance may be a result, in part, of the vigorous 
campaign of education which has been conducted by the local residents’ 
improvement association regarding the desirability of restricting a 
“non-conforming use” plant in the area. This “leveling” of attitudes is 
an indication of the success with which this association has conducted its 
educational campaign. As a matter of fact, at two annual meetings of 
the group and in the mimeographed annual reports sent to all members, 
an outstanding authority on city planning had presented the basic 
principles of zoning and appropriate land use, stressing the problems of blight 
in residential areas as a result of failure to protect such residential areas 
from encroachments of industry in general and of non-conforming nui- 
sance industries in particular. 

Fully as significant as this absence of difference in opinions of near 
and far residents regarding their property values is the marked difference 
between these two groups regarding their fears and anxieties arising from 
the recent fire, and the possibilities of another and more serious fire of 
catastrophic proportions. The responses to two free answer questions 
clearly highlight this difference, as do the responses to questions seven 
through ten. Typical comments regarding the plant were: “Definite 
danger from explosion and fire. The traffic is also too heavy for a 
residential area partly due to the plant.” “It’s a fire hazard.” “When the 
wind is blowing from that direction, there is danger to my home in the 
event of an explosion.” “Could explode at any time . . . it was just a 
miracle that the fire department was able to control it . . . that's all 
that saved us.” “The trucks are a danger to the children.” “The 
gasoline tanks are the only cause of concern.” “So much gas and oil— 
an explosion might involve anyone within a quarter of a mile or half mile 
radius.” “There is always danger from an explosion. In this last fire, 
what if the fire had reached the tanks?” “Danger of explosion. We're 
close enough here. It would probably blow the windows out of the 
place.” “The fire hazard is the particular thing. A bad fire could 
seriously damage the community.” “From fire and then the traffic and 
loading—so many trucks and trains.” “I think we are all in danger from 
a real big fire. If those barrels of tar had gone up in the air in the earlier 
warehouse fire it would have been bad. Another fire would be on a bigger 
scale.” 

Eleven per cent of the total sample commented on another free an- 
swer question asking whether any steps had been taken to protect prop- 
erty during the fire. Some of the comments follow: “No. What can 
you do? Didn't know what steps to take. Before the war we wanted to 
sell and get away, and then thought we'd better stay some place.” 
“Figured there wasn’t much we could do—just stand and watch." “We 
carried valuables away from the house. Smoke was rolling in the 
windows. Rooms filled with smoke.” “Got personal papers ready to 
leave.” “Not much we could do.” “Got dressed and ready to leave if 
it spread. Nothing possible to protect property.” “Got children up 
and dressed in case they'd have to move.” “We just got away from here. 
We were protected by firemen.” “We took things into kitchen.” 

When the responses to all items on the questionnaire are summated, 
assigning plus values to responses favorable to Company X and minus 
values to unfavorable responses, it is possible to obtain a distribution of 
scores which reflects degrees of attitudes towards the company. These 
summated response distributions indicate a less favorable attitude towards 
Company X on the part of those households living close to the 
plant than exists for those living farther away. The extent to which 
these feelings are fear and anxiety responses rather than responses based 
on property issues is indicated when separate scores are obtained for 
items 1 to 6 and for items 7 through 10. In the latter group of items was 
included a scoring of responses to the two free answer questions intended 
to indicate strength of feelings. Means and standard deviations are 
presented in Table 1 for the distribution of scores on attitudes regarding 
"property" issues, and on "fear" attitudes, for both near and far residents. 
These results indicate differences significant at the one per cent 
level between near and far residents on fear issues, but negligible differences 
between these residents on property issues. 


Table 1 
Means and Standard Deviations of Distributions of Scores for “Near” and “Far” 
Residents on Attitudes on Plant Expansion and Effects of Plant on Property Values, and 
on Attitudes of Fear and Anxiety Toward the Plant as a Menace to the Community. 

                                  “Near” Residents   “Far” Residents 
                                      (N = 67)           (N = 77)       Mean 
                                    Mean     SD        Mean     SD      Diff.    C.R. 

Score on Plant Expansion 
  and Property Value Questions      8.45    3.39       8.34    2.95     0.13    0.22 

Score on Fear Questions             4.91    2.28       3.92    2.33     0.99    2.58 
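
The critical ratios in Table 1 can be checked approximately from the printed means, standard deviations, and group sizes. The sketch below assumes the conventional large-sample formula, the difference between two independent means divided by the standard error of that difference. The article does not state which formula was used, and the figures are entered as read from Table 1 (near group N = 67, far group N = 77), so small discrepancies from the printed values are to be expected.

```python
import math


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two independent means divided by its standard error.

    Assumed formula: SE = sqrt(sd_a^2 / n_a + sd_b^2 / n_b).
    """
    se_diff = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / se_diff


# Values as read from Table 1: near residents first, far residents second.
fear_cr = critical_ratio(4.91, 2.28, 67, 3.92, 2.33, 77)
property_cr = critical_ratio(8.45, 3.39, 67, 8.34, 2.95, 77)

print(round(fear_cr, 2), round(property_cr, 2))
```

Under these assumptions the ratio for the fear questions comes out at about 2.6, in the neighborhood of the value conventionally required for the one per cent level, while the ratio for the property questions is well below 1.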

Discussion 

The preceding analysis and presentation indicates the feasibility of 
using public opinion techniques for collecting data to point up the need 
for continuance or change of zoning policies toward specific “sore spots" 
within municipalities. Of particular interest in this survey is the finding 
that distance from an industrial establishment has a marked influence on 
the reactions of residents in the neighborhood to its possible danger, and, 
perhaps, may also indicate annoyance with the traffic which develops 
because of its presence. Thus, an easily determined “fear gradient" can 
be disclosed. The presence and the intensity of the anxiety is found to 
be inversely related to the distance at which residents in the neighborhood 
live from the “fear object.” 

On the other hand, distance from the establishment is found to have 
little influence on attitudes towards plant development and expansion 
when good lines of communication exist among the residents of the 
neighborhood. Thus, in the present survey, a long-time program of 
education by the community organization has so directed the attention 
of the residents at a distance from the plant to its effect on them and their 
property as to eliminate almost completely the factor of distance in deter- 
mining opinions regarding the effect of this particular company’s presence 
on property values throughout the surrounding residential area. 

The survey here reported suggests the possibility, indeed, the desir- 
ability, of a policy of adding opinion polling experts to the staffs of city 
planning commissions to measure the opinions of residents and of in- 
dustrialists in those areas where sharp conflict exists. Such opinions 
could also be measured from time to time to determine shifts of opinion 
which should be taken into consideration in policy determination. 
Perhaps the method of public hearings, now universally used, will come to 
be supplemented by scientific evidence of opinion, fairly and systematically 
gathered. Here is one more area in which the science of public 
opinion polling can be utilized as a means of providing public policy 
makers with data now provided only in a haphazard manner by those 
who are able and willing to attend public hearings. The democratic 
process of giving all concerned a “hearing” would be enhanced by the 
polling technique which gives the members of each and every group an 
opportunity to be heard, by means of data systematically gathered, and 
objectively analyzed and reported. 


Summary 


1. Opinions of residents in the neighborhood of an industrial nuisance, 
a gasoline and oil storage and distribution center, were determined as 
evidence regarding the appropriate zoning of this area. A majority of 
residents in the area held opinions unfavorable to expansion or entrench- 
ment of the plant in the neighborhood. 

2. When residents were split into two groups, one living near the 
plant, the other living at a greater distance from the plant, little difference 
was observed between groups in their attitudes towards plant expansion 
and towards the effect of the presence of the plant on their own property 
values. Marked differences were observed, however, when opinions 
towards the plant as a serious threat to their safety and the safety of their 
property were tapped, the group living near the plant displaying more fear 
and anxiety than those living at a distance. 

3. These findings suggest the desirability of use of opinion polling 
techniques as an aid to planning authorities, not only to give guidance in 
the zoning of communities, but to assure that each group in the neigh- 
borhood has an opportunity to be heard. 


Received March 10, 1948. 




The 97th Psychological Barometer * 


Henry C. Link and Albert D. Freiberg 
The Psychological Corporation, New York City 


The great issue in the world today seems to be that between state 
capitalism (sometimes labelled fascism, communism, socialism, totali- 
tarianism) and private capitalism (sometimes labelled free enterprise or 
the American way of life). The Psychological Barometer has included 
some questions which bear directly on this issue. 


Faith in Government vs. Faith in Business Management 


In the April Psychological Barometer, a question asked in three previous 
Barometers was repeated to measure trends in people’s attitudes toward 
Government control and management of business. The results of the 
four studies are given below. 


Q. “If all manufacturing companies were completely managed by the 
Government instead of by private management as at present, would you get 
more for your dollar or less?” 


——— 


                      Oct.     May      May      Apr. 
                      1945     1946     1947     1948 

Would get less         38%      35%      36%      49% 
About the same         24        9       14       10 
Would get more         19       24       20       20 
Uncertain              19       32       30       21 

Total Interviews     2500     5000     5000     2500 


The greater efficiency of private management as compared with 
Government management is much better recognized today, according to 
these results, than a few years ago. The latest results show that faith 
in the efficiency of the Government is highest among the less educated 
and the skilled and semi-skilled “C” and “D” wage earners group, and 
among respondents in union families. 

* This survey is the 97th in a series of Psychological Barometers, the oldest nation-wide 
poll of public opinion and buying habits now in existence. Begun in March 1932, these 
surveys are made four times a year with 10,000 personal interviews and twice a year with 
[number illegible] interviews. Details of sampling and method are given at the end of this report. 




                         Socio-Economic Groups          Union Membership 
                                                        Union      Non- 
                        A       B       C       D       Members    Union 

Would get less         62%     57%     46%     36%       45%       51% 
About the same          6       9      11      10        11         9 
Would get more         13      17      20      29        24        19 
Uncertain              19      17      23      25        20        21 

Total Interviews      250     750    1000     500       698      1802 


Another question pertaining to people’s faith in the Government as 
compared with their faith in business and in unions was: 


Q. “In the long run, who does the most for the good of the workers: their 
employer, the Government, the unions?" 


                        Socio-Economic Groups       Union Membership
                                                   Union     Non-
            Total     A      B      C      D      Members    Union

Employer     37%     50%    45%    35%    23%      24%       42%
Unions       (figures illegible in source)
Government   15      10     13     13     21       11        16
Uncertain    13      16     12     14     13       10        14


Again, there is considerable variation between the opinions of people 
in the upper socio-economic levels and of people at the lower end of the 
scale, and between respondents in union families and other people. 

Three other questions related to people's faith in Government and 
private enterprise were also asked: 


Q. “If you could choose any job you wanted to, would you rather work for 
a private business company, or for some department of the Government?” 


                                   Socio-Economic Groups
Answers                  Total     A      B      C      D

Private company           61%     79%    68%    58%    45%
Government department     33      18     27     34     48
Uncertain                  6       3      5      8      7


                               Sex              Union Membership
                                               Union     Non-
Answers                  Men     Women        Members    Union

Private company           65%     56%           53%       63%
Government department     31      35            40        31
Uncertain                  4       9             7         6


The 97th Psychological Barometer 445 


Q. “What is the safest way to invest your savings: in a savings bank, in life 
insurance, in Government savings bonds, in bonds of leading companies, in 
stocks of leading companies, in real estate?" 


                                        Socio-Economic Groups
Answers                        Total     A      B      C      D

Government savings bonds        65%     66%    65%    65%    64%
Life insurance                  13      16     15     13      9
Savings bank                    11       3      9     12     15
Real estate                      9      10     11      8      8
Stocks of leading companies      1       2      1      1      1
Bonds of leading companies       1       2      1      1      1
Uncertain                        4       6      3      3      6


Q. “Where will your savings earn the most money and still be fairly safe?” 


                                        Socio-Economic Groups
Answers                        Total     A      B      C      D

Government savings bonds        45%     36%    43%    47%    51%
Life insurance                   8      --      9      8      6
Savings bank                     6       4      4      7     10
Real estate                     12      13     15     12      9
Stocks of leading companies     (figures illegible in source)
Bonds of leading companies      (figures illegible in source)
Uncertain                       11      12      8     11     13


People in all walks of life have a higher regard for Government savings 
bonds than for any other form of investment, both in respect to safety 
and profit. 


Free High School and College Education 


In view of the recent report of the President's Commission on Educa- 
tion recommending that the Government provide a college education for 
every boy and girl, the results of the following two questions are of 
particular interest. 


Q. “Do you believe every boy and girl is entitled to a four-year high school 
education paid for by the State or Government?” 


                        Socio-Economic Groups           Sex
            Total     A      B      C      D       Men    Women

Yes          87%     86%    86%    87%    88%      86%     88%
No            9      10     11      9      7       11       7
Uncertain     4       4      3      4      5        3       5


Q. “Do you believe that every boy and girl is entitled to a four-year college 
education paid for by the State or Government?" 


                        Socio-Economic Groups           Sex
            Total     A      B      C      D       Men    Women

Yes          39%     27%    35%    40%    52%      38%     41%
No           51      65     58     50     35       54      47
Uncertain    10       8      7     10     13        8      12


The general public, then, does not concur with the President's Commission's 
report on college education. 


The principal reasons given by the people who were opposed to a 
Government paid four-year college education for every boy and girl were 
as follows: 


                                                        Total

Better for initiative for people to stand on their 
  own feet; students should work their way 
  through college; it's a family responsibility; 
  shouldn't be a gift                                    33%

Too many unqualified people would go and 
  therefore our educational standards would be 
  lowered                                                27

Too expensive; taxes would be too high                   16

Only if people are qualified for college; only if 
  they have the mental capacity                           7

High school education is enough; college education 
  isn't necessary for everyone                            5

Not democratic; not the Government's responsibility       5


Those favoring a free college education for everyone, when asked why, 
gave such answers as: 


                                                        Total

Many who need and want higher education can't 
  afford it; would give everyone an equal 
  chance                                                 26%

People need a college education to get along             16

Deserving and qualified people should get 
  college education 

Would make a better world; would raise the 
  standard of living 

People are entitled to it as citizens; the Government 
  should help children 

Taxes are high enough to cover it; it should be 
  paid for by taxes 

(The remaining figures are illegible in the source.) 


Labor Saving Machines and Employment 


Q. “Do new labor saving machines like steam shovels, cotton pickers, 
power lawn mowers, etc., increase or decrease the number of jobs in the long 
run?" 


                            Socio-Economic Groups
Answers          Total     A      B      C      D

Increase jobs     48%     57%    55%    47%    32%
Decrease jobs     45      38     39     45     55
Uncertain          7       5      6      8     13


In spite of the conspicuous example of the effect of machinery in the 
United States, almost one-half the population still clings to the fallacy 
that machines decrease jobs in the long run. Education still has a big 
task to perform at this point, especially among women. 


                       Sex              Union Membership
                                       Union     Non-
Answers          Men     Women        Members    Union

Increase jobs     54%     41%           42%       50%
Decrease jobs     39      51            49        43
Uncertain          7       8             9         7


Strength of Anti-Communism in the U. S. A. 


Two different questions were asked on Communism in the April 
survey: 


Q. “Which of these two statements expresses your own idea best: (1) 
Communism is like other political parties in the U. S. A. which try to put over 
their ideas; or (2) Communism is like a fifth column which is loyal to Russia 
first and to the U. S. A. second?” 

The unanimity of the urban population on this question is of particular 
interest. 


                           Socio-Economic Groups     Union Membership
                                                    Union     Non-
Answers            Total    A     B     C     D    Members    Union

Communism like 
  fifth column      74%    88%   77%   73%   66%     73%       74%
Communism like 
  other parties     10      9    11    10     9      10        10
Uncertain           16      3    12    17    25      17        16


Q. “If the Communists take over Italy as they took over Czechoslovakia, 
should the United States declare war on Russia?" 


                        Socio-Economic Groups       Union Membership
                                                   Union     Non-
Answers     Total     A      B      C      D      Members    Union

Yes          31%     29%    31%    31%    33%      34%       30%
No           46      54     48     45     39       45        46
Uncertain    23      17     21     24     28       21        24


Palestine 


Q. “Are you in favor of splitting Palestine into two parts, one for the Jews, 
the other for the Arabs?” 


                        Socio-Economic Groups
Answers     Total     A      B      C      D

Yes          45%     43%    46%    46%    39%
No           27      29     28     29     25
Uncertain    28      28     26     25     36


There was practically no difference between various income levels on 
this question. However, people from the East and Far West were 
more in favor of partition than people from the Mid-West and 
South. 


                    Geographic Areas
                      Mid-               Far
Answers     East      West     South     West

Yes          50%       42%      38%       46%
No           23        31       31        28
Uncertain    27        27       31        26


Although almost one-half of the American people are in favor of 
partitioning Palestine so as to make a separate Jewish State, 60 per cent, 
as shown below, are against sending soldiers to help enforce this partition. 

Q. “Would you favor sending U. S. troops to Palestine to keep order?" 

                        Socio-Economic Groups
Answers     Total     A      B      C      D

Yes          25%     20%    26%    24%    26%
No           60      69     61     61     52
Uncertain    15      11     13     15     22


Here again, there were no important differences by income groups. 
The Mid-Western respondents, however, were most strongly against the 
sending of soldiers to Palestine. 


                    Geographic Areas
                      Mid-               Far
Answers     East      West     South     West

Yes          28%       20%      26%       25%
No           58        67       55        57
Uncertain    14        13       19        18


A cross analysis of the answers to the two questions shows that only 
15 per cent of all people questioned are both in favor of the partition and 
in favor of sending troops. 


Is the Oleomargarine Tax Fair? 


Attempts to remove the ten-cent-a-pound tax on pre-colored 
oleomargarine have met considerable opposition in Congress. Yet, three-fourths 
of the urban population said they believed this tax was unfair 
when they were asked: 


Q. “You pay a tax of 10¢ a pound on oleomargarine if it is colored before 
you buy it. Do you think this tax is fair or unfair?" 


                        Socio-Economic Groups
Answers     Total     A      B      C      D

Unfair       75%     77%    78%    74%    70%
Fair         14      11     12     15     16
Uncertain    11      12     10     11     14


                  Sex                  Geographic Areas
                                         Mid-               Far
Answers     Men     Women      East      West     South     West

Unfair       79%     71%        74%       75%      77%       73%
Fair         13      15         13        14       13        17
Uncertain     8      14         13        11       10        10


Men are strongest in their condemnation of the tax but the large 
majority of all groups are opposed to it. 


Profits 


In the May 1946 Psychological Barometer and again in April 1948, two 
questions were asked to determine people's opinions concerning the 
profits large companies make and the profits they should make. 


Q. “Out of every dollar which large business companies take in, about how 
many cents do you think they keep as clear profit?" 


Q. “How many cents out of every dollar do you think they should keep as 
a fair profit?" 


                       Profits Companies     Profits Companies
                             Keep               Should Keep
                        May       Apr.        May       Apr.
Cents of a Dollar       1946      1948        1946      1948

Under 10¢               18%       24%         17%       28%
10¢-29¢                 40        37          42        45
30¢ and over            32        29          15        17
Uncertain               26        10          26        10


Still, the great majority have a highly exaggerated idea of company 
profits. Indeed, 66 per cent think these profits are from less than one to 
seven times as high as they actually are. The actual profit, according to 
Government reports, is less than 10¢ on a dollar of sales even at present 
high profit rates. 


The Relative Standing of Various Organizations 


Two different questions on people's attitudes toward various 
organizations were included in the April survey. 


Q. “Which of the following organizations do you think well of and which 
not so well of?” 


Organization        Total        Organization          Total

Y. M. C. A.                      American Red Cross
  Well               90%           Well                 72%
  Not so well         6            Not so well          25
  Doubtful            4            Doubtful              3
Boy Scouts                       Salvation Army
  Well               97            Well                 89
  Not so well         2            Not so well           6
  Doubtful            1            Doubtful              5


The Red Cross has considerably less favor than the other three 
organizations. 

The second of these had been asked in April 1946 and was repeated 
to measure trends. The results of both surveys are shown below. 


Q. “Which of the following organizations do you think well of and which 
not so well of?” 
                                   Apr.    Apr.    Change Since
                                   1946    1948    April 1946

The U. S. Chamber of Commerce 
  Well                              65%     69%       +4%
  Not so well                       11       8        -3
  Doubtful                          24      23        -1
The А. F. of L. 

Well 50 

Not so well 31 

Doubtful 19 
National Ass'n of Mfrs. 

Well 37 40 +3 

Not so well 17 16 -1 

Doubtful 46 44 -2 
The American Legion 

Well 77 76 -1 

Not so well 15 13 -2 

Doubtful 8 11 +3 
The C.I.O. 

Well 26 

Not so well 56 

Doubtful 18 


Explanation of the Survey 


The current survey was made with 5000 interviews during April 1948, by 
393 interviewers under the supervision of 150 psychologists and in 148 cities 
and towns. It represents a true cross-section of the urban population. Two 
questionnaires were used, one with half the sample, or 2500 people, the other 
questionnaire with the other half of the 5000 people. These two samples were 
comparable by geographic, sex, socio-economic, and other criteria. 

Sampling Method. A modified area sampling method was used. All interviews 
were assigned by the local supervising psychologist by blocks and streets 
in accordance with maps constructed to designate the proper socio-economic 
levels. These maps are made to divide the population into four principal 
groups, the “A” group consisting primarily of owners and executives, the “B” 
group, primarily white-collar and semi-professional, the “C” group or 
factory and transportation workers, and the “D” group or the less skilled. About 27 
per cent of the sample were union members. All interviews were made in the 
home, but only one in a family; half were made with women, half with men. 


Received June 29, 1948. 
Early publication. 


A Farm Knowledge Test 


Austin E. Grigg 
Medical College of Virginia 


In classification studies, such as practiced in prisons, courts, 
rehabilitation agencies, etc., it is now common practice to include 
psychological tests which purport to evaluate the individual's aptitude and/or 
achievement in such broad work areas as mechanical and clerical fields. 
Tests of educational achievement often are included also. When the 
classification program is conducted among those of largely rural 
background, the need for some objective measure of farm knowledge becomes 
apparent. The present article is a report of a farm background test which 
was devised primarily as a part of the vocational appraisal of male adult 
prisoners at the Virginia State Penitentiary. The test was devised in 
response to a direct request from the Classification Department which 
pointed out the large percentage of prisoners with rural backgrounds. 


Selection of Test Items 


Experience had taught that most of the men on whom the test would 
be used came from relatively poor farm areas where general rather than 
specific crop farming was practiced. Studies had also demonstrated that 
the group on whom the test would be used would, for the most part, be 
poorly educated. Mental age statistics made at several different periods 
found the group to average dull normal intelligence, with a skew to the 
lower mental levels noted. 

It was decided, therefore, to select only those items which would 
reflect a general knowledge of varied aspects of farming, but which would 
not sample items of a managerial nature. Since men from all sections of 
the State of Virginia would be included in the test population, it was 
decided to make the test cover as many aspects of farming as possible: 
crops, fruit trees, livestock, market facts, etc. 

Two staff members of the State Department of Agriculture were 
consulted and sample items were constructed with their assistance. (This 
was necessary because the psychologist in the case is city bred.) Also, 
three experienced farmers were consulted and they were asked to list 
everyday farm facts which any normal farm boy should know. From 
these sources, over 200 items were constructed. With the assistance of 
the three experienced farmers, a final selection of items was made. 
Because the test would most often be used in group situations, the items 
were then written for multiple-choice style and the final format checked 
again by the three farmer advisers. 


Description of the Test 


The final test, as now used by the Virginia State Penitentiary, follows. 


Answers which the three farmer advisers agreed should be keyed as correct 
are underlined. These answers apply only to Virginia. 


1. When is winter wheat sowed? 
Sept. Oct. Nov. Dec. Jan. Feb. 

2. When is winter wheat harvested? 
March & April; May & June; June & July; July & August 

3. What is the average per acre yield in bushels for wheat? 
0-10 bushels; 10-20 bushels; 20-30 bushels; 30-40 bushels 

4. When is corn planted? 
Jan. thru Mar.; Apr. thru June; July thru Sept. 

5. What is average yield of corn per acre in Virginia? 
30-40 bushels; 50-60 bushels; 70-80 bushels; 90-100 bushels 

6. When is corn harvested? 
June & July; July & August; August & September; September & October 

7. When is tobacco marketed in Virginia? 
Sept thru Jan; Jan thru Apr; Apr thru Aug. 

8. Hereford, Shorthorn, and Pole Angus are kinds of: 
Beef cattle; milk cows; sheep; draught horses 

9. Which of the things listed below is common disease among livestock? 
Spasmosis; Black leg; Lick tongue; Ring foot 

10. Which of the below is a common disease among poultry? 
Cholera; New Craw; T.B.; Ring Foot 

11. Which one of the crops below grows best in sandy soil? 
Corn; Wheat; Potatoes; Oats; Barley 

12. Which one of the crops below grows best in clay soil? 
Corn; Wheat; Potatoes; Oats; Barley 

13. At what age does an apple tree begin to bear fruit? 
1-2 Years old; 2-3 Years; 3-4 Years; 4-5 Years 

14. How long will an apple tree bear if properly cared for? 
5-10 Years; 10-15 Years; 15-20 Years; 20-25 Years; 25-30 Years 

15. At what age does a peach tree begin to bear fruit? 
1 Year old; 2 Years; 3 Years; 4 Years; 5 Years 

16. What time of year are trees pruned? 
Jan-Feb; Mar-Apr; May-June; June-July; July-Aug; Sep-Oct. 


17. What is average milking life of a dairy cow? 
3-5 Years; 7-9 Years; 11-13 Years; 15-17 Years 

18. What is average number of months a dairy cow can be milked each year? 
4-5 months; 6-7 months; 8-9 months; 10-11 months 

19. Which type of cow produces the most milk? 
Jerseys; Holsteins; Guernseys; Hereford 

20. Corn which has been cut up and run into a silo and permitted to 
for feeding dairy cows is called: 
Lespedeza; Ensilage; Masher-meal; Alfalfa 

21. How old should a calf be before butchering for veal for human consumption? 
6-9 weeks; 10-12 weeks; 13-16 weeks; 17-20 weeks 

22. After being bred, how long will it take before a cow has a calf? 
4 months; 5 months; 6 months; 7 months; 8 months; 9 months 

23. What is daily milk yield of average cow? 
1-2 gallons; 2-3 gallons; 3-4 gallons; 4-5 gallons; 5-6 gallons 

24. A herd of cattle free from disease is known as: 
Inspected herd; Grade A herd; Commercial herd; Stump herd 

25. How much wool can you get from the average sheep at each shearing? 
1-2 lbs; 3-4 lbs; 5-6 lbs; 6-7 lbs; 7-8 lbs; 8-9 lbs. 

26. When are sheep sheared? 
March; April; May; June; July; August; September 

27. How many pigs come per litter, usually? 
3-5; 6-10; 12-15; 16-18; 20-25 

28. What time of year do hens lay best? 
Spring; Summer; Fall; Winter 

29. What time of year do fowl molt? 
Spring; Summer; Fall; Winter 

30. When does a dry spell hurt farming most? 
Mar-Apr; May-June; July-Aug; Sept-Oct; Nov-Dec. 


Statistical Results 


Quite naturally, the test is able to discriminate those with farm 
backgrounds from those who have had no farm experience. When applied to 
a sample of 79 white adult prisoners of rural background and to 22 white 
adult prisoners of urban background, the test yielded a biserial correlation 
of .73. Mean score for the rural groups was 12.4, sigma 4.7. Mean 
score for the urban groups was 5.8, sigma 5.7. 

The test does not correlate well with number of years of farm 
experience, however. When the product-moment correlation was computed 
for the rural sample for score on the Farm Knowledge Test vs. number of 
years residence on a farm, the correlation was found to be .31 for 79 cases. 
Failure of the test to correlate highly with number of years of farm 
experience was believed at first to be because of possible correlation between 
intelligence and farm test score, the hypothesis being that more alert 
individuals could grasp the knowledge required by the test within a 
relatively short time, whereas duller individuals might spend years on a 
farm without learning some of the knowledge required by the test. When 
the scores on the Farm Test were correlated with those made on the 
Revised Beta, a non-verbal group test of intelligence, however, the 
resulting correlation was insignificant: .15. 

It is now believed that the test correlates poorly with years of farm 
experience because the facts required on the test sample breadth of farm 
experience rather than length of farm learning: different types of farming 
rather than how deeply any specific type has been practiced and mastered. 

Test-retest reliability for the 79 rural cases was found to be .94 after 
two weeks interval. 

Summary 

1. A test of Farm Knowledge has been described and the need for 
such a test in certain classification programs has been pointed out. 

2. The test does not correlate significantly with number of years of 
experience on the farm and this is believed to be because the test samples 
breadth of experience rather than depth of learning and accomplishment. 

3. As was expected, the test is able to discriminate between those with 
rural backgrounds and those with urban backgrounds. 

4. The test has proved useful in discriminating the experience range 
among individuals of rural backgrounds, and has been found practical 
in prison and pre-parole classification work. 

Received March 10, 1948. 


Creating Factor Comparison Key Scales 
by the Per Cent Method 


Edward N. Hay 
Edward N. Hay and Associates, Inc., Philadelphia 


The first step in installing a plan of Factor Comparison job evaluation 
is to create the Key Factor Scales by means of which the jobs are evalu- 
ated. Some characteristics of the Factor Comparison method are: 


1. The factor scales are created from about a dozen Key Jobs, 
selected for the purpose. 

2. Not more than Three to Six Factors are used, not subdivided 
(2, 4). 

3. The values assigned to the job factors on the factor scales are de- 
rived from Judgments of these key jobs, expressed by the committee 
members in the process of comparing one job factor with another (1, 7). 

4. Evaluation of jobs by Factor Comparison is accomplished by 
Comparing the Factors of all jobs to be evaluated with those of the key 
jobs on the Factor Scales (1, 9). 

5. Job evaluation by any method is dependent on the Judgment of 
the evaluators. In Point methods much, if not most, of the judgment 
is in the factors; their selection, definition, pointing, and weighing. In 
Factor Comparison the judgment is less in these things, and more in the 
comparison of factors of jobs being evaluated. 

6. The Reliability of evaluation is increased by Pooling the judgments 
of a number of qualified and trained persons. 

7. Experience shows that it is not difficult to make factor comparisons, 
and that the statistical Reliability of such estimates, reached independ- 
ently by a number of judges, is consistently above .90 (8). 

8. It has been noted that Comparison of job factors follows Weber's 
Law, in that an observable difference between job factors is expressed as 
a ratio of that difference to the magnitude of one of these factors. This 
“difference limen" has been found to be 15% (3, 5, 6). 


The Factor Comparison method was originated by Eugene J. Benge 
(1) about 1926. His method of creating the Factor Scales makes use of 
the "fair and going rates" of the jobs used as key jobs. This feature of 
using salary or wage values as a means of deriving point values for the 
factors of the key jobs is not properly understood by a good many users 
of other job evaluation plans. There is no doubt, however, of its entire 
practicability when used correctly. It has been used for all kinds of jobs, 
including executive ones to $50,000 a year. Nevertheless, there are 
situations in which it is inadvisable to rely on salary or wage values of the 
key jobs. This is most often for reasons of strategy or policy, such as 
when a joint Union-Management committee is doing the evaluation. 


The Per Cent Method 


To meet this situation the Per Cent Method of creating factor key 
scales was devised by William D. Turner (7, 10). It does not depend 
on salary or wage values but on an ingenious method of combining two 
sets of evaluators' judgments about the key jobs: 


1. The magnitude of each “factor” of a job in relation to the other 
factors in that job, and 

2. The relative magnitude of a factor in one job to the same factor in 
the other key jobs. 


All eight characteristics of Factor Comparison job evaluation which 
were listed are made use of by the Per Cent Method. It is being used 
successfully by a management consulting organization, most recently for 
a salaried group of 2100 persons, so that it is in no way experimental. 


A New Procedure for the Per Cent Method 


The procedure described here for developing the factor scales by the 
Per Cent Method is shorter than the one devised by Turner (10) and gives 
similar results. Occasionally there is a difference of a small per cent. 
The new method is best understood by following the steps shown here- 
after. Steps shown in Tables 1 and 2 are the same as in the original Benge 
method (1). 

The figures in Table 1 are read from top to bottom and show the com- 
mittee's opinion of the order of rank of the three jobs by factors. That 
is, job No. 2 is thought to require the most skill, job No. 1 the next most, 
and job No. 3 the least skill, and similarly for the other two factors. 


Table 1 
Rank of Jobs by Factor 


Job No.     Skill     Decisions     Responsibility 

(The body of this table is illegible in the source.) 


In Table 2 the three factors within each job have been ranked by the 
committee with relation to one another (read across). 


Table 2 
Rank of Factors in Jobs 
(Read across) 

Job No.     Skill     Decisions     Responsibility 
1             2           3               1 
2             1           3               2 
3             1           3               2 


The rank orders of Tables 1 and 2 are now converted to percentage 
relationships. This is done by assuming that the job factor ranked highest 
in value is 100%. The group of judges then agrees on the size of the 
others in relation to that one, expressing their judgments as percentages 
of the value of the highest ranked one. For example, in Table 3 job No. 2, 
being ranked highest in value in Skill, is 100%; while job No. 1 is con- 
sidered by the judges as being about three-quarters as much and so is 
placed at 75%. 


Table 3 
Factor % Ratings 
(Read down) 

Job No.     Skill     Decision     Responsibility 
1             75%       100%           100% 
2            100        100             85 
3             50         33             25 
Total        225        233            210 


Table 4 
Job % Ratings 
(Read across) 

Job No.     Skill     Decision     Responsibility 
1             85%        40%           100%      = 225 
2            100         40             75       = 215 
3            100         25             75       = 200 


When job factors are far apart in size, they cannot be compared as 
accurately as when they are of nearly the same size. A familiar 
illustration is the attempt to judge the height of two men, one of whom is 
standing beside you and the other standing 100 feet away in an open space. 
Gross differences are noticeable at a distance but not small ones. 
Accuracy in estimating the size of differences is less than if the two men, or 
other objects being judged, are side by side. No attempt has yet been 
made to determine experimentally the accuracy with which job factors of 
widely differing difficulty values can be estimated. If traditionally the 
job with the highest difficulty value in a group of jobs is about four times 
the lowest one, it can be presumed that differences between factors of such 
jobs will not be much more. This is a useful guide in estimating the 
probable range of factor values from high to low. In view of the 
relevancy of Weber's Law, it is obvious that observable per cent differences 
between job factors must necessarily be about 15%. 

Tables 5 and 6 are the same as Tables 3 and 4, except that all values 
have been shrunk so that totals will add to 100. 
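
The shrinking step can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not part of the original procedure: the function and variable names are invented here, and the ratings are those of Tables 3 and 4 (taking the Responsibility rating of job No. 2 in Table 4 as 75, the value its row total of 215 requires). The printed Tables 5 and 6 show whole numbers adjusted by hand so that each total is exactly 100, so a printed cell may differ from plain rounding by one unit.

```python
# Ratings from Table 3 (read down) and Table 4 (read across).
# Rows are jobs 1-3; columns are Skill, Decisions, Responsibility.
FACTOR_RATINGS = [[75, 100, 100], [100, 100, 85], [50, 33, 25]]   # Table 3
JOB_RATINGS = [[85, 40, 100], [100, 40, 75], [100, 25, 75]]       # Table 4


def scale_columns(table):
    """Express each cell as a per cent of its column total (F% values)."""
    totals = [sum(col) for col in zip(*table)]
    return [[100 * cell / totals[j] for j, cell in enumerate(row)] for row in table]


def scale_rows(table):
    """Express each cell as a per cent of its row total (J% values)."""
    return [[100 * cell / sum(row) for cell in row] for row in table]


f_pct = scale_columns(FACTOR_RATINGS)
j_pct = scale_rows(JOB_RATINGS)

# Print one line per job: F% values, then J% values, to one decimal.
for job, (f_row, j_row) in enumerate(zip(f_pct, j_pct), start=1):
    print(job, [round(v, 1) for v in f_row], [round(v, 1) for v in j_row])
```

Leaving the results unrounded until the end makes it easy to see where the hand adjustment in the printed tables was applied.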


Table 5 
F% Values 
(Read down) 

Job No.     Skill     Decision     Responsibility 
1             33         43             48 
2             45         43             40 
3             22         14             12 
Total        100        100            100 


Table 6 
J% Values 
(Read across) 

Job No.     Skill     Decision     Responsibility 
1             38         17             45       = 100 
2             46         19             35       = 100 
3             50         12             38       = 100 


Table 7 is constructed by dividing each cell of Table 5 by the 
corresponding cell of Table 6. This operation is the key to the Per Cent 
Method, because by it the relative total values of the jobs are determined. 
In fact, two sets of total job values may be extracted from Table 7: 


1. The totals of the rows of Table 7 give the relative job value totals 
derived from one set of committee judgments; namely, the percentages 
of all factors to their jobs as in Table 6. 

2. The reciprocals of the totals of the columns give the relative factor 
value totals which were derived from the other set of committee 
judgments; namely the percentages of job factors to the total value of each 
of the three factors in Table 5. 
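
The division and the two sets of totals can be illustrated with a short Python sketch. It is offered only as a check on the arithmetic under stated assumptions: the variable names are invented for the illustration, the inputs are the whole-number values of Tables 5 and 6, and the Decisions entry for job No. 3 in Table 5 is taken as 14, the value its column total of 100 requires. Because the sketch divides without rounding each cell first, its totals agree with the two-decimal figures of Table 7 only to within about a unit in the last place.

```python
# F% values (Table 5) and J% values (Table 6).
# Rows are jobs 1-3; columns are Skill, Decisions, Responsibility.
F_PCT = [[33, 43, 48], [45, 43, 40], [22, 14, 12]]
J_PCT = [[38, 17, 45], [46, 19, 35], [50, 12, 38]]

# Each cell of Table 7 is F% divided by the matching J%.
ratios = [[f / j for f, j in zip(f_row, j_row)] for f_row, j_row in zip(F_PCT, J_PCT)]

# Row totals: relative total value of each job.
job_totals = [sum(row) for row in ratios]

# Column totals and their reciprocals: relative total value of each factor.
column_totals = [sum(col) for col in zip(*ratios)]
factor_totals = [1 / total for total in column_totals]

print([round(t, 2) for t in job_totals])
print([round(t, 2) for t in column_totals])
print([round(t, 3) for t in factor_totals])
```

The row totals come to roughly 4.47, 4.38 and 1.92, and the column totals to roughly 2.29, 5.96 and 2.52, whose reciprocals are close to .437, .168 and .397.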


Table 7 
F%/J% Ratios 

Job No.                 Skill     Decisions     Responsibility 
1                        .87        2.53            1.07 
2                        .98        2.26            1.14 
3                        .44        1.17             .31 
Total                   2.29        5.96            2.52 
Reciprocals             .437        .168            .397 
Reciprocals, adjusted   .470        .180            .426 


Begin Table 8 by copying from Table 7 the row (job) totals. 
Distribute these totals among the factors in the same percentages chosen by the 
evaluating committee in Table 6. In performing this operation, multiply 
each value by 100, in order to avoid using decimals. Any multiplier 
would do, because the job totals would still be relatively the same. 


Table 8 
Totals Distributed Per Tables 5 and 6, × 100 and × 1000 

Job No.     Skill     Decisions     Responsibility 
1            170          76             201 
             155          77             205 
2            195          87             153 
             210          77             170 
3             96          23              73 
             105          26              51 
Total        470         180             427 


Next, take the reciprocals of the factor column totals from Table 7, 
multiplied by 1000, and by the fraction 1077/1002, so that the grand total 
of the row (job) totals in Table 8 will be the same as the grand total of the 
column (factor) totals, 1077. This step is essential to permit direct 
comparison of the two sets of committee judgments, which otherwise 
would not be to the same scale. Now enter them in the corresponding 
positions in Table 8. Distribute each of these three column totals among 
the jobs in the proper column according to the percentages 
selected by the committee and shown in Table 5. Table 8 now contains 
two sets of values of each factor of each job: one set derived from the 
opinions of the committee as to the size of each factor of each job in 
relation to the other factors in the job; and the other derived from the opinions 
of the committee as to the size of each factor of each job in relation to the 
size of the same factor in the other jobs. Any differences between pairs 
of values in Table 8 are the result of variation in the two sets of opinions 
of the committee. It is now the problem of the committee of evaluators 
to decide what value to use, when there is a difference between the two. 
In cells 1-D and 1-R the differences are under 2%; in cells 1-S, 2-S, and 
3-S, the differences are less than 10% of the larger value; and in cells 
2-D, 2-R and 3-D the differences are between 10% and 12%. In only 
one cell is there a larger difference: 30% in cell 3-R. 
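
The two distributions and the cell-by-cell comparison can be sketched as follows. This is an illustrative reading of the steps rather than the author's worksheet: the names are invented, the job totals (447, 438, 192) are the row sums of the Table 7 cells multiplied by 100, the factor totals (470, 180, 427) are the adjusted column totals shown in Table 8, and the Decisions entry for job No. 3 in Table 5 is taken as 14. The figures printed in Table 8 appear to have been adjusted by hand, so some cells computed here differ from the printed ones by a few units.

```python
# Percentages from Tables 5 and 6; totals derived from Table 7.
F_PCT = [[33, 43, 48], [45, 43, 40], [22, 14, 12]]   # Table 5, read down
J_PCT = [[38, 17, 45], [46, 19, 35], [50, 12, 38]]   # Table 6, read across
JOB_TOTALS = [447, 438, 192]        # Table 7 row totals x 100
FACTOR_TOTALS = [470, 180, 427]     # adjusted reciprocals x 1000

# Set 1: each job total spread across its factors by the J% values.
from_jobs = [[total * pct / 100 for pct in row] for total, row in zip(JOB_TOTALS, J_PCT)]

# Set 2: each factor total spread across the jobs by the F% values.
from_factors = [[FACTOR_TOTALS[j] * pct / 100 for j, pct in enumerate(row)] for row in F_PCT]


def percent_difference(a, b):
    """Difference between two values as a per cent of the larger one."""
    return 100 * abs(a - b) / max(a, b)


# One line per cell: job-factor label, both values, and their difference.
for job, (row_a, row_b) in enumerate(zip(from_jobs, from_factors), start=1):
    for name, a, b in zip(("S", "D", "R"), row_a, row_b):
        print(f"{job}-{name}: {a:6.1f} {b:6.1f} {percent_difference(a, b):5.1f}%")
```

Expressing each difference as a per cent of the larger value follows the wording used in the text for cells 1-S, 2-S and 3-S.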


Application of Weber's Law to the Key Scales 


Experimental work (3, 5, 6) has shown that the “difference limen" or 
threshold of perception of differences in Factor Comparison job 
evaluation is 15%. Differences of this magnitude can be perceived about 75% 
of the time by an experienced committee composed of qualified and 
well-trained individuals (8). It follows that there is no point in using 
evaluating scales which provide finer distinctions than about 15%. It is in 
order, therefore, to adjust the values in Table 8 to conform with 
differences of 15%. It is convenient to use 100 as a base, particularly since 
100 was the base for the per cent judgments of Tables 3 and 4. Table 9 
gives 15% intervals from 100. 
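
A scale of this kind can be generated with a few lines of Python. The sketch below rests on an assumption not stated in the text: that the step ratio is the fifth root of two (about 1.149, or roughly 15%), which makes the values double exactly every five steps. The function name and parameters are invented for the illustration. A published scale that was built by halving and rounding by hand may differ from these figures by one unit in its lower entries.

```python
def interval_scale(base=100, steps_per_doubling=5, steps_up=5, steps_down=5):
    """Return a geometric scale around `base`, rounded to whole numbers.

    The ratio between neighbours is 2 ** (1 / steps_per_doubling), which is
    about 1.149 for five steps, i.e. roughly a 15% interval.
    """
    ratio = 2 ** (1 / steps_per_doubling)
    return [round(base * ratio ** k) for k in range(steps_up, -steps_down - 1, -1)]


scale = interval_scale()
print(scale)
```

With the default arguments the scale runs from 200 down through 100 to 50, one doubling above the base and one halving below it.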

Table 9 
15% Intervals Based on 100 


200     100     50     25     12 
174      87     43     22     10 
152      76     38     19      9 
132      66     33     16      8 
115      57     29     14      7 


The values have been rounded. It will be noticed that they double 
every five steps, a convenience in evaluation work. In deciding what 
values to use for the final factor scales there are three logical choices: 


1. Adopt the factor value derived from the row (job) totals. 
2. Take those derived from the column (factor) totals. 
3. Use the average of the row and column factor values. 


Experimentally and logically the column relations of Table 5 (job to
job by factor) are more easily seen and therefore more validly and reliably
judged than the row (factor to factor by job) relations of Table 6. The
latter set of relationships seems to evaluators to be somewhat artificial.
Table 10 shows the final factor scales which can now be used for evaluating
all other jobs. In the illustration used here there are only three jobs,
for convenience and simplicity. In actual situations there would be from
10 to 20 key jobs.


Table 10 
Factor Scales 


Factor Values
(15% Intervals) Skill Decisions Responsibility 


200 Job No. 2 t Job No. 1 
174 Job No. 2 
152 Job No. 1 


100 Job No. 3 


Job No. 1 and No. 2 
Job No. 3 


Job No. 3 


Proof That Row Totals Correspond With Job Totals


Let $S_t$ be the sum total of Skill of all jobs.
$D_t$ and $R_t$ are respectively the Decisions and Responsibility of all jobs.
$S_a$ is the Skill of job $a$.
$J_a$ is the value of job $a$ and $= S_a + D_a + R_a$.
$J_t$ is the total value of all jobs and $= S_t + D_t + R_t = J_a + J_b + J_c + \cdots + J_n$.


1. Each cell of the first column of Table 5 is an F% and is $\dfrac{S_a}{S_t}$.

2. Each cell of the first column of Table 6 is a J% and is $\dfrac{S_a}{J_a}$.

3. Table 7 is F%/J% $= \dfrac{S_a}{S_t} \div \dfrac{S_a}{J_a} = \dfrac{S_a}{S_t} \times \dfrac{J_a}{S_a} = \dfrac{J_a}{S_t}$ in row 1, col. 1.

4. Add horizontal values, $\dfrac{J_a}{S_t} + \dfrac{J_a}{D_t} + \dfrac{J_a}{R_t}$ for the first row.

5. This can be reduced to the form $J_a \cdot \dfrac{D_t R_t + S_t R_t + S_t D_t}{S_t D_t R_t}$.

6. If these row totals are now added, the grand total at the foot of the
column is, substituting X for the fraction in 5 above,

$$J_a X + J_b X + J_c X + \cdots + J_n X = J_t X.$$

The value of the fraction, "X," may be cancelled from both sides of
the equation. The result is

7. $J_a + J_b + J_c + \cdots + J_n = J_t$, which is true by definition.


Proof That Reciprocals of Column Totals Correspond
With Factor Totals


In dealing with the correct relation between the factors, it is necessary
to use the reciprocals of the totals of the columns for Skill, Decisions and
Responsibility as will be apparent from the following steps, which follow
after step 3 of the proof relating to the horizontal (job) values:

4. If the values in the first column of cells (the Skill column) are added,
the result is $\dfrac{J_a + J_b + J_c + \cdots + J_n}{S_t}$.

5. But $J_a + J_b + J_c + \cdots + J_n = J_t$.

6. Therefore the fraction in 4 equals $\dfrac{J_t}{S_t}$.

7. The reciprocal of this final fraction is, of course, $\dfrac{S_t}{J_t}$.

8. The reciprocals of the totals of the Decisions and Responsibility columns
will similarly be, respectively, $\dfrac{D_t}{J_t}$ and $\dfrac{R_t}{J_t}$.

9. If these three column totals are added the result will be
$\dfrac{S_t}{J_t} + \dfrac{D_t}{J_t} + \dfrac{R_t}{J_t} = \dfrac{S_t + D_t + R_t}{J_t}$.

10. But, by definition, $S_t + D_t + R_t = J_t$.

11. So, the fraction in step 9, which is $\dfrac{S_t + D_t + R_t}{J_t}$, equals $\dfrac{J_t}{J_t} = 1$, which
can also be written $S_t + D_t + R_t = J_t$.
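Both proofs can be checked numerically. The sketch below is only an illustration: the three jobs and their Skill, Decisions and Responsibility values are hypothetical, the function name `table7` is invented for the example, and every factor value is assumed to be nonzero so that the divisions are defined.

```python
# Hypothetical factor values (Skill, Decisions, Responsibility) for three jobs.
jobs = {
    "a": (50.0, 30.0, 20.0),
    "b": (40.0, 40.0, 20.0),
    "c": (20.0, 10.0, 10.0),
}


def table7(jobs):
    """Build the F%/J% cells described in step 3 of the proof."""
    names = list(jobs)
    n_factors = len(jobs[names[0]])
    job_totals = {j: sum(jobs[j]) for j in names}                      # J_a, J_b, ...
    factor_totals = [sum(jobs[j][k] for j in names) for k in range(n_factors)]  # S_t, D_t, R_t
    cells = {}
    for j in names:
        row = []
        for k in range(n_factors):
            f_pct = jobs[j][k] / factor_totals[k]   # job's share of the factor total
            j_pct = jobs[j][k] / job_totals[j]      # factor's share of the job total
            row.append(f_pct / j_pct)               # reduces to J_a / S_t, etc.
        cells[j] = row
    return cells, job_totals, factor_totals


cells, job_totals, factor_totals = table7(jobs)
grand_total = sum(job_totals.values())              # J_t

# First proof: row totals are proportional to the job totals.
row_totals = {j: sum(row) for j, row in cells.items()}
row_shares = {j: row_totals[j] / sum(row_totals.values()) for j in cells}

# Second proof: reciprocals of column totals equal S_t/J_t, D_t/J_t, R_t/J_t.
col_totals = [sum(cells[j][k] for j in cells) for k in range(len(factor_totals))]
reciprocals = [1.0 / c for c in col_totals]

print(row_shares)
print(reciprocals, sum(reciprocals))
```

With these made-up figures each row share matches the job's share of the grand total, and the three reciprocals sum to 1, as steps 9 to 11 require.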


Summary 


Best methods of job evaluation rest on factor scales which are used to
evaluate jobs by comparison. Point methods use a priori scales and
Factor Comparison methods use scales developed for each installation by
an evaluating committee. The original Benge method of evaluation,
which has since become known as Factor Comparison, uses scales developed
by judgments of the committee members applied to a few "key
jobs," values for which are derived from the going wage or salary rates of
the key jobs. Another way of developing the key factor scales is known
as the Per Cent Method, which does not make use of money values. A
new and short version of the Per Cent Method is described here, which
should have application whenever policy or strategy forbids the use of
money values for key jobs. It is particularly applicable to higher
positions: those from about $5000 to $50,000 or more. Its successful use
depends on two things: (1) careful selection of the evaluating group,
preferably with psychological aids, and (2) sound and thorough training
of the evaluators, having in mind at all times the principles of psychological
measurement.

Received July 7, 1948.
References 


1. Benge, E. J., Burk, S. L. H., and Hay, E. N., Manual of job evaluation. New York:
Harper & Bros., 1941.
2. Definitions of Factors—unpublished manuscript by Edward N. Hay and АШ 
3. on SS пи Регану. New York; D. App 
., 1941. 
4. Standards of value for executive compensation, privately published manu 
Edward N. Hay. N 

5. Hay, Edward N., Characteristics of factor comparison job evaluation, Personnel,
1946, 22, No. 6.
6. Hay, Edward N., Psychophysical phenomena in job evaluation (to be published).
7. Hay, Edward N., Four methods of establishing factor scales in factor comparison
job evaluation, Personnel, 1946, 23, No. 2.
8. Hay, Edward N., Reliability of job evaluation by the factor comparison method (to
be published).
9. Hay, Edward N., Training the evaluation committee in factor comparison job evaluation
(to be published).

10. licct William D., The per cent method of job evaluation, Personnel, 1$ 
o. 6. 


Reliability and Comparability of 
Different Job Evaluation Systems * 


David J. Chesler 
Personnel Research Institute, Western Reserve University 


The purpose of this investigation was twofold: (1) To study the 
reliability of a job evaluation manual as determined by a comparison of 
ratings made by independent raters evaluating the same jobs; and (2), to 
determine the degree to which different types of job evaluation systems 
give the same results. These two problems are treated together in one 
report, rather than separately, because the experimental procedures 
followed for both problems were so closely related, and because the re- 
sults of either problem have greater meaning when compared with those 
of the other. 

Previous Research on Reliability in Job Evaluation. The problem of
reliability in job evaluation is relatively simple from the viewpoint of 
experimental design, but difficult to attack practically because of the in- 
conveniences involved in obtaining the services of experienced job evalu- 
ators in industry to act as raters. It is probably for this reason that very 
little research has been done on the problem. 

Lawshe (3) presents some data concerning reliability. The same
group of raters rated the same five jobs on three different occasions. The
smallest fluctuation for any one job was 25 points and the greatest
fluctuation was 100 points; the average fluctuation was 71 points. In
terms of cents per hour this was equivalent to a low of $.02 and a high
of $.10, with an average of $.05.

* A condensation of a portion of a Ph.D. thesis submitted to the Graduate School of
Western Reserve University in 1948. The writer is greatly indebted to Jay L. Otis,
Director of the Personnel Research Institute of Western Reserve University, for direction
and encouragement in this study. This study would have been impossible without the
efforts of certain individuals engaged in job evaluation work in several organizations in
Cleveland, Ohio. That these individuals found time in the midst of busy schedules to
perform the arduous task of rating a group of jobs according to one or more job evaluation
systems is tribute to them and to the organizations with which they are affiliated. A
special debt of thanks, therefore, is due to: Richard H. Rice, American Greeting Publishers,
Incorporated; Arthur S. Hann, Central National Bank of Cleveland; Clyde L.
mer and Frederick W. Becker, Cleveland Electric Illuminating Company; George H.
Thobaben, Cleveland Graphite Bronze Company; Norman B. Bradley and Charles A.
„Низак, East Ohio Gas Company; Elwood V. Denton, Federal Reserve Bank of Cleveland;
Clark C. Sorenson and Richard C. Hoff, Harris-Seybold Company; Joseph J.
erban, National Screw and Manufacturing Company; Sterling T. Apthorp, Wayne
R. Flight, and Frederick A. Castle, Standard Oil Company of Ohio; and Madeline L.
ngenecker, Warner and Swasey Company.

In a later study Lawshe and Wilson (4) made a comparison of the
reliabilities of a long and an abbreviated scale. This study involved 40
jobs and 20 job analysts. Each analyst rated 20 of the 40 jobs, although
no two raters rated the same 20 jobs either under the long or the short
system. However, each consecutive pair of judges, between them, rated
all 40 jobs and were treated statistically as "one man." Under these
conditions reliabilities of .77 for the long system and of .89 for the short
system were obtained. Application of the Spearman-Brown formula to
obtain an estimate of the correlation between the "pooled ratings" of two
groups of 5 judges each yielded correlations of .94 and .98 for the long and
short systems respectively.
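The stepped-up figures quoted above can be reproduced with the usual Spearman-Brown prophecy formula, r_k = k r / (1 + (k - 1) r). The sketch assumes a lengthening factor of k = 5, which is an inference from the "groups of 5 judges" wording rather than something the text states outright; the function name is hypothetical.

```python
def spearman_brown(r, k):
    """Predicted reliability when the number of raters is multiplied by k."""
    return k * r / (1 + (k - 1) * r)


# Single-rater reliabilities reported for the long and short systems.
long_system = spearman_brown(0.77, 5)
short_system = spearman_brown(0.89, 5)
print(round(long_system, 2), round(short_system, 2))  # 0.94 0.98
```

Both results agree with the .94 and .98 reported for the long and short systems.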

Ash (1) recently made a study of factor reliability in which various 
job analysts ranked a group of jobs for each of nine factors. The correla- 
tions between the rankings of each analyst on each factor and the “median 
array” for each factor ranged from .25 to .98. 

It would seem that in studying reliability, job evaluators should follow 
the methods generally followed in experimental and industrial psychology.
This would mean that there would be two approaches to the problem. 
One of these might be called the rate-rerate method, which would be 
analogous to the “test-retest” method in psychological testing. The 
other method involves a comparison of the results of independent raters 
or groups of raters in making the original ratings. In both of these 
methods it is desirable that the same jobs be rated by all the analysts or
groups of analysts participating. It seems desirable also that one report 
simple straightforward product moment correlations between point 
ratings assigned either by the same individual or groups on two different 
occasions, or by different individuals or groups operating independently. 

Previous Research on Comparability of Different Job Evaluation Systems.
Very few, if any, published studies can be found which report
comparisons of two or more job evaluation systems, although various
authors do not hesitate to cite the advantages of and imply greater
validity in various types of system. Viteles (6) recognized this situation,
and stated it well: ". . . certain of the issues can be so easily settled by
a direct comparison, under controlled conditions, of results. . . . Such
a comparison would give valuable information of the time taken to obtain
such results, of the effort and money expended, and of the reliability,
validity, and usefulness of the final product. However, in spite of one or
two steps in this direction, definitive experiments still remain to be
made."

Otis and Leukart (5) have indicated the method to be followed in
determining the differences among results obtained with different systems:
"To date there has not been enough research on the application of different
methods of job evaluation to the same jobs. Such comparisons should
be made to determine if approximately the same classifications would be
obtained when the same jobs are evaluated by different methods."

As may be inferred from the foregoing, one of the objectives of the 
present study was to make a direct comparison of different job evaluation 
systems in use in several organizations. 


Method 


The essential feature in the methodology of the present study was to 
keep certain variables constant under certain conditions. In job evalu- 
ation the variables are the jobs, the raters, and the job evaluation 
manuals. In the present study, the jobs were constant throughout, and
are referred to as the “standard jobs.” Another constant in certain parts 
of the study was a point rating type of job evaluation manual, referred to 
as the “standard manual,” to distinguish it from the various manuals 
actually in use in the organizations which participated. And finally, 
certain comparisons have been made in which the raters were kept 
constant. 

The experimental procedure was comparatively simple. In essence, 
job analysts on the staffs of several industrial and commercial organiza- 
tions were asked to rate a set of standard jobs on the standard manual 
and on their own company manuals.

The Standard Jobs. Job descriptions and specifications for 35 clerical, 
administrative, and supervisory jobs were taken from the files of a large
commercial organization. The standard jobs, therefore, were jobs that 
actually existed in a going organization; they were not hypothetical or 
“ideal” jobs devised for the experiment. The job descriptions and 
specifications were taken practically intact, and only very minor modifica- 
tions and revisions were made for the purpose of disguising the identity 
of the organization in which the jobs existed. A few minor editing 
changes were made to clarify duties or specifications of some of the jobs. 

The 35 standard jobs were identified by number and name as follows:
1. Addressograph operator; 2. Assistant auditor; 3. Assistant personnel
manager; 4. Assistant purchasing agent; 5. Bookkeeping machine operator;
6. Building manager; 7. Building office stenographer; 8. Chief telephone
operator; 9. Employment interviewer and women's counsellor;
10. Expense accounting paying clerk; 11. Expense accounting report
clerk; 12. Expense audit clerk; 13. Head of payroll division; 14. Key
punch operator; 15. Mail department clerk-typist; 16. Messenger; 17.
Payroll clerk; 18. Payroll clerk-typist; 19. Payroll record clerk; 20.
Personnel clerk; 21. Personnel clerk-typist; 22. Photostat machine
operator; 23. Recordak operator; 24. Registered nurse; 25. Secretary to
auditor; 26. Senior general files clerk; 27. Senior IBM clerk; 28. Senior
mail clerk; 29. Senior multigraph operator; 30. Sorting machine operator;
31. Stockkeeper and receiving clerk; 32. Supervisor of tabulating;
33. Tabulating machine operator; 34. Telephone operator; and 35.
Teletype operator.
These particular jobs were selected because they represent a wide
range of job difficulty and because they are typical of salaried jobs in
many industrial and commercial organizations. The company in which
these jobs existed had approximately 500 job classifications and a wage
plan with 15 labor grades. The 35 standard jobs were distributed over
labor grades 1 to 14 as shown in Table 1.


Table 1 
Distribution of 35 Standard Jobs According to Labor Grade 


Labor Grade 
1 2 3 4 5 6 7 8 9 10 11 12 13 14
No. of Jobs ZG 570. 4:4 8 1 1 2 1 1 


It was the feeling of the chief job analyst in the organization from
which the job descriptions were taken that this distribution was typical
of the distribution of all the jobs in this organization.

The job specification for each standard job was divided into 12
categories in order to key in with the 12 rating scale items of the standard
manual described below.

The Standard Manual. The standard manual was a fairly typical 
point rating manual with 12 factors. The 12 factors are listed in Table 
2. The first column indicates the relative percentage points for, or
numerical weight of, each factor. These weights were arrived at as the
result of two years of work in the Personnel Research Institute of Western
Reserve University with factors identical or similar to those listed above.
The weights represent the consensus of dozens of individuals with varying
amounts and types of personnel experience.

Participating Companies. A total of nine independent private industrial
firms participated in this study. In this report these companies
are referred to as Co. A, Co. B, etc. These code letters were assigned to
the companies in the order in which they submitted any raw data.

1 This manual was the most recent form of a manual which had gone through several
stages of development in the Personnel Research Institute of Western Reserve University.
It was very much like the manual in use in the organization which contributed the
standard jobs.


Table 2
Point Values Assigned to Each Degree of Standard Manual Factors

Level
Item 1 2 3 4 5 6

1. Work experience 21 38 п 88 105 
2, Essential knowledge and training т з ИС: о. Т 85 
3. Dexterity 4 9 15 20 
4. Character of supervision received о 20 30 4 о» 
5. Character of supervision given n 22 з и 5 
6. Number supervised тз 5 ми 2 35 
7. Responsibility for funds, securities, 
and other valuables 6 11 6 20 5 30 
8. Responsibility for confidential matters 6 18 30 
9. Responsibility for getting along with 
others 6 14 2 з 
10. Responsibility for accuracy—effect 
of errors 6 12 з и 23 
11. Pressure of work 4 9 15 2 
12. Unusual working conditions 2 6 10 


The Raters. The number of raters in each company ranged from one
to three. All of the raters were well trained and experienced job analysts
who devoted most of their time in their respective organizations to job
evaluation work.

The Company Job Evaluation Manuals. In addition to rating the
standard jobs on the standard manual, raters in six of the nine companies
also rated the jobs on their own company manuals. A brief description
of each of these company manuals follows:


Co. A—A factor comparison system practically identical to that presented
by Benge, Burk, and Hay (2). This system contains five
factors: (1) Mental effort; (2) Skill; (3) Physical effort; (4) Responsibility;
and (5) Working conditions.

Co. B—A point rating system with 15 factors.

Co. C—A factor comparison system similar in all respects to the system
of Co. A, except for one factor, "supervision," which was present
in place of "working conditions."

Co. F—A point rating system with 13 factors.

Co. G—A point rating system with 15 factors.

Co. I—This company had no formal job evaluation plan as such, but its
system might be described as a combination ranking and grade
classification method. The jobs in this company were classified
into established labor grades. Any new job that was created was
compared with similar jobs already classified to determine the
particular labor grade in which it should fall. It was then given a
rank within the labor grade and a salary assigned to it somewhere
between the salaries of the job above it and the job below it.


Experimental Procedure. Job analysts in each company were asked
to rate the 35 standard jobs on the standard manual and on their own
company manuals. The specific instructions to the raters were as follows:
"1. Rate the 35 standard jobs on the standard manual. Rate the jobs
as described, not as such jobs may happen to exist in your organization."
"2. Rate the 35 standard jobs on your company manual. Again, rate
jobs as they are described, not as such jobs may happen to exist in your
organization."

The raters were not informed of the point values assigned to the
factors and degree levels for the standard manual. This was done in
order to control as much as possible the bias which might operate if the
factor weights were known.

The raters were cautioned to do step 2 of the instructions as independently
of step 1 as possible; in other words, they were told not to allow
their judgment while using the standard manual to influence their judgment
while using their own company manuals. As a matter of fact, of
the six companies which completed step 2, four completed step 1 and
submitted the data before beginning step 2.

One minor problem which it was felt might present itself in rating the
standard jobs on the company manuals was the fact that the job specifications
were expressed in terms of the factors which comprised the standard
manual. However, all of the raters reported that they had no difficulty
in rating the standard jobs on their own manuals. The reason for this is
probably the fact that the job descriptions were very detailed and
thorough.

The total point values for the standard jobs, both on the standard
manual and the various company manuals, together with the point values
for individual factors in the standard manual constituted the raw data
for the present study.


Results and Interpretation 


Reliability of the Standard Manual. Data from the first seven companies
in which job analysts rated the standard jobs on the standard
manual were utilized for determining reliability of the manual. Table
3 shows the product moment intercorrelations among these job analysts.
These reliability coefficients range from .93 to .99 with an average of .97.
These coefficients seem unusually high, but upon further consideration
the results are not unexpected. In the opinion of the writer, the key to
the magnitude of these coefficients lies in the thoroughness and detailed
nature of the job descriptions and specifications of the standard jobs.
The data in Table 3 indicate the degree of reliability, as measured by the
correlation coefficient, that can be attained with a carefully constructed
job evaluation manual, thorough and detailed job analyses, and experienced
job evaluators.
Table 3
Reliability of the Standard Manual: Intercorrelations of Raters in Seven
Companies Who Rated the Same Jobs on the
Same Job Evaluation Manual


         Co. A   Co. B   Co. C   Co. D   Co. E   Co. F   Co. G

Co. A            .95     .98     .98     .95     .97     .99
Co. B    .95             .96     .95     .93     .95     .95
Co. C    .98     .96             .98     .96     .99     .99
Co. D    .98     .95     .98             .96     .98     .98
Co. E    .95     .93     .96     .96             .96     .95
Co. F    .97     .95     .99     .98     .96             .97
Co. G    .99     .95     .99     .98     .95     .97
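A table of this kind can be produced from raw point ratings with the product moment (Pearson) formula. The sketch below is illustrative only: the raters and their point totals for five jobs are hypothetical, since the study's raw data are not reproduced in the article, and the function names are invented for the example.

```python
import math


def pearson_r(x, y):
    """Product moment correlation between two equal-length lists of ratings."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def intercorrelations(ratings):
    """Correlate every pair of raters; each pair appears once."""
    names = list(ratings)
    return {
        (a, b): pearson_r(ratings[a], ratings[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Hypothetical total point ratings of the same five jobs by three raters.
ratings = {
    "Rater 1": [120, 245, 310, 180, 95],
    "Rater 2": [130, 240, 330, 170, 110],
    "Rater 3": [115, 260, 300, 195, 90],
}

pairs = intercorrelations(ratings)
for pair, r in pairs.items():
    print(pair, round(r, 2))
print("mean r:", round(sum(pairs.values()) / len(pairs), 2))
```

The same function applied to seven raters and 35 jobs would yield the 21 distinct coefficients that make up a table such as Table 3.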


The correlation coefficient is only one index of reliability. Another,
and perhaps more practical, index is the range of fluctuation of point
ratings for each job among the seven different raters. For the 35 standard
jobs the fluctuations in point ratings ranged from 9 to 106 points with
an average of 37.9 points. The fluctuations in point ratings may be
analyzed in terms of labor grades, as shown in Table 4, which is based
upon equal labor grades of 25 points each. The fluctuation among seven
sets of ratings of 35 jobs was the value of 1.0 labor grade (25 points) or
less for 20 per cent of the jobs, and the value of 1.5 labor grades (37.5
points) or less for 57.1 per cent of the jobs. It is difficult to form a judgment
as to whether these results are "good" or "not so good," in view of
the paucity of previous research on the problem. The findings do
emphasize, however, the different impressions of reliability that may be
obtained from correlation coefficients and from fluctuations among raters.


Table 4
Fluctuations in Terms of Labor Grades Among Seven Raters
Who Rated the Same 35 Jobs on the Same Job
Evaluation Manual


Fluctuation in    No. of    Per cent    Cumulative
labor grades      jobs      of jobs     per cent

0-0.5               1         2.9          2.9
0.51-1.0            6        17.1         20.0
1.01-1.5           13        37.1         57.1
1.51-2.0           10        28.5         85.6
2.01-2.5            2         5.7         91.3
2.51-3.0            1         2.9         94.2
3.01-3.5            1         2.9         97.1
3.51-4.0            0         0           97.1
4.01-4.5            1         2.9        100.0


Table 5
Correlations Between Each Factor of Standard Manual and Total Score
on Standard Manual for 35 Standard Jobs


Factor                                         Co. A   Co. B   Co. C   Range

1. Work experience                              .89     .90     .89     .01
2. Essential knowledge and training             .76     .70     .77     .07
3. Dexterity                                   -.06    -.19    -.11     .13
4. Character of supervision received            .85     .81     .90     .09
5. Character of supervision given               .81     .82     .83     .02
6. Number supervised                            .72     .72     .77     .05
7. Responsibility for funds, securities,
   and other valuables                          .23     .21     .31     .10
8. Responsibility for confidential matters      .68     .70     .57     .13
9. Responsibility for getting along with
   others                                       .64     .63     .64     .01
10. Responsibility for accuracy—effect
    of errors                                   .48     .65     .76     .18
11. Pressure of work                            .06     .49     .52     .46
12. Unusual working conditions                 -.25    -.30    -.26     .05
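The fluctuation index described above is simple to compute: take the spread between the highest and lowest rating each job received and divide by the 25-point width of a labor grade. In the sketch below the four jobs and their seven ratings are hypothetical, and the function names are invented for the illustration.

```python
POINTS_PER_GRADE = 25  # Table 4 is based on equal labor grades of 25 points each


def fluctuation_in_grades(ratings_by_job):
    """Spread (max minus min) of each job's ratings, expressed in labor grades."""
    return {
        job: (max(points) - min(points)) / POINTS_PER_GRADE
        for job, points in ratings_by_job.items()
    }


def percent_within(fluctuations, limit):
    """Per cent of jobs whose fluctuation is `limit` labor grades or less."""
    hits = sum(1 for value in fluctuations.values() if value <= limit)
    return 100 * hits / len(fluctuations)


# Hypothetical point ratings of four jobs by seven raters.
ratings_by_job = {
    "Job 1": [40, 45, 49, 42, 44, 41, 47],
    "Job 2": [200, 225, 210, 215, 205, 220, 212],
    "Job 3": [300, 330, 310, 337, 305, 320, 315],
    "Job 4": [400, 506, 450, 430, 470, 480, 410],
}

fluct = fluctuation_in_grades(ratings_by_job)
print(fluct)
print(percent_within(fluct, 1.0), percent_within(fluct, 1.5))
```

A tally of `fluct` into half-grade bins would give the kind of frequency distribution shown in Table 4.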

Relationship Between Each Factor of Standard Manual and Total Score.
The data from three of the participating companies were analyzed to
determine the relationship between each factor of the standard manual
and total score on the standard manual. Table 5 shows the correlations
between each factor and total score and the range of these correlations
for each factor. It is believed that the magnitude of the range is an
indication of the reliability of each factor. The ranges in Table 5 vary
from .01 for "work experience" to .46 for "pressure of work." However,
disregarding the latter would give a variability of .01 to .13.

The relatively high range for "pressure of work" would seem to indicate
that there was a difference in interpretation of this factor between
Co. A on the one hand and companies B and C on the other.

The negative correlations for "unusual working conditions" and 
“dexterity” indicate that for salaried jobs in general jobs with better 
working conditions and requiring less manual dexterity are paid better 
wages than jobs with poor working conditions and requiring more manual 
dexterity. 


Comparison of Company Job Evaluation Manuals. The product moment
intercorrelations of six company manuals as used by raters or
groups of raters in each company to evaluate the standard jobs are shown
in Table 6. These correlations range from .89 to .97 with a mean of .94.
They are almost as high as the reliability coefficients reported in Table 3
for the standard manual, and indicate that various types of job evaluation
systems classify jobs very much the same.


Table 6
Intercorrelations of Six Different Company Manuals Used to Rate
35 Standard Jobs


         Co. A   Co. B   Co. C   Co. F   Co. G   Co. I

Co. A            .91     .97     .95     .95     .89
Co. B    .91             .95     .94     .94     .91
Co. C    .97     .95             .95     .97     .93
Co. F    .95     .94     .95             .95     .89
Co. G    .95     .94     .97     .95             .94
Co. I    .89     .91     .93     .89     .94
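The range and mean quoted for Table 6 can be checked from its fifteen distinct coefficients. The sketch below takes the values from the table itself; the dictionary layout and variable names are just one convenient way to hold them.

```python
# The 15 distinct intercorrelations of Table 6 (each company pair listed once).
table6 = {
    ("A", "B"): 0.91, ("A", "C"): 0.97, ("A", "F"): 0.95, ("A", "G"): 0.95, ("A", "I"): 0.89,
    ("B", "C"): 0.95, ("B", "F"): 0.94, ("B", "G"): 0.94, ("B", "I"): 0.91,
    ("C", "F"): 0.95, ("C", "G"): 0.97, ("C", "I"): 0.93,
    ("F", "G"): 0.95, ("F", "I"): 0.89,
    ("G", "I"): 0.94,
}

values = list(table6.values())
lowest, highest = min(values), max(values)
mean_r = sum(values) / len(values)
print(lowest, highest, round(mean_r, 2))  # 0.89 0.97 0.94
```

This is a plain arithmetic mean of the coefficients; the article does not say whether the author averaged the coefficients directly or by some other method.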
As in the case of the reliability coefficients discussed above, the high
correlations among the different company manuals are probably due 
primarily to the thoroughness and detail of the standard job descriptions 
and specifications, although here again, the competence of the job 
evaluators who participated undoubtedly contributed. 

The practical import of these findings is that the particular type of 
system used in an organization is not nearly so important as the integrity
and accuracy with which it is installed, policed, and maintained. If 
most systems yield generally the same results, obviously the problem of 
deciding upon a system to adopt boils down to questions of time, ease of 
understanding on the part of all individuals concerned, and ease of in- 
stallation and maintenance. 

Comparison of the Standard Manual and the Company Manuals. The
correlations between the standard manual and the company manual in
each of six companies were as follows: Co. A, .91; Co. B, .88; Co. C, .95;
Co. F, .90; Co. G, .95; Co. I, .99.

The range and magnitudes of these correlations are very much in line
with those reported above among different company manuals. Since the
standard manual may be thought of as "another company manual" which
was tried out in several places on a set of standard jobs along with other
company manuals, these findings with respect to range and magnitude of
correlation coefficients were expected. If the correlations above had been
lower than those reported in Table 6, then there would have been reason
to believe that there was something unusual about the standard manual.
Specifically this would mean that there was comparatively less commonalty
between the standard manual on the one hand and the company
manuals on the other. However, the results indicate that this was not
the case.


Summary and Conclusions 


1. The basic methodological feature of the study was to have raters in 
various companies evaluate a standard set of job descriptions and speci- 
fications for 35 representative salaried jobs on a standard job evaluation 
manual and on their own respective company manuals. The standard 
manual was of the point-rating type and contained 12 factors. 

2. The reliability of the standard manual was determined by compar- 
ing the results of independent raters in making original ratings. Inter- 
rater correlation coefficients ranged from .93 to .99 with an average of .97. 
The high order of these coefficients was ascribed primarily to the thorough
and detailed nature of the job descriptions and specifications. For the
35 jobs, fluctuations among seven raters ranged from 9 to 106 points with
an average of 37.9 points. The fluctuation among the seven sets of
ratings was equal to the point value of 1.0 labor grade or less for 20 per
cent of the jobs, and equal to the point value of 1.5 labor grades or less 
for 57.1 per cent of the jobs. 

3. Correlations between each factor of the standard manual and total 
score on the standard manual were computed from data submitted by 
three companies. Except for one factor, there was a high degree of 
similarity among the three correlations thus computed for each factor, 
suggesting high factor reliability. Negative correlations were obtained 
between total score on the standard manual and the factors, "unusual 
working conditions" and “dexterity,” indicating that salaried jobs with 
better working conditions and requiring less manual dexterity tend to be 
paid higher wages than salaried jobs with poor working conditions and 
requiring more manual dexterity. 

4. Intercorrelations among six different company job evaluation 
systems ranged from .89 to .97 with a mean of .94. These six systems 
included two factor comparison systems with 5 factors each, two point 
rating systems with 15 factors each, one point rating system with 13 
factors, and one ranking system. The results indicate a high degree of 
commonalty among different job evaluation systems. 

Received July 15, 1948. 
Early publication. 


References 


1. Ash, P. The reliability of job evaluation rankings. J. appl. Psychol., 1948, 32,
313-320.

2. Benge, E. J., Burk, S. L. H., and Hay, E. N. Manual of job evaluation. New York:
Harper & Brothers, 1941.

3. Lawshe, C. H., Jr. Studies in job evaluation: II. The adequacy of abbreviated
point ratings for hourly-paid jobs in three industrial plants. J. appl. Psychol.,
1945, 29, 177-184.

4. Lawshe, C. H., Jr., and Wilson, R. F. Studies in job evaluation. 6. The reliability of two point
rating systems. J. appl. Psychol., 1947, 31, 355-365.

5. Otis, J. L., and Leukart, R. H. Job evaluation. New York: Prentice-Hall, 1948.

6. Viteles, M. S. A psychologist looks at job evaluation. Personnel, 1941, 17, 165-176.


Industrial Noise and Hearing 


Robert B. Sleight and Joseph Tiffin 
Division of Education and Applied Psychology, Purdue University 


Industrial noise has increased steadily with growth in mechanization.
As an undesirable feature of the industrial environment, noise today is of
increasing concern to many groups. There is some evidence that in the
industrial situation noise is a contributing factor to inefficiency, fatigue,
lowered morale, absenteeism, accidents, and labor turnover. Probably,
however, the most incontestable evidence is that industrial noise contributes
to deafness. It is the purpose of this article to summarize certain
published studies on the relation of noise to impairment of hearing, and
to note the possible implications of the findings from the standpoint of
compensation for the resultant disability.

Generally conceded to mean any unwanted sound, noise has been 
rather thoroughly studied. In order to study noise and its effect, it is 
necessary to measure it. Its accurate measurement, i.e., determination 
of its intensity and composition, is now possible. Intensity is most often 
measured by a sound-level meter. This is an instrument having a micro- 
phone which is placed in the noise field. The resultant electrical current 
output of the microphone is then indicated on a calibrated decibel scale. 
The intensity is usually expressed in terms of the decibel, a unit which 
may have its zero point standardized at approximately the least sound 
that can be heard by the “normal” ear. The composition of a noise is
essentially determined by an analyzer which indicates the intensity of the
frequency components of the complex sound. Dependable noise measurement
is not a simple process and usually requires the services of a skilled
acoustic technician.
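
The decibel scale described above can be illustrated with a short calculation. The sketch below assumes the conventional reference pressure of 20 micropascals as the zero point, a value the article does not state; the function names are chosen for illustration only.

```python
import math

# Assumed reference pressure (20 micropascals). The article says only that
# the zero point lies near the faintest sound a "normal" ear can hear.
REFERENCE_PRESSURE_PA = 20e-6


def sound_pressure_level_db(pressure_pa: float) -> float:
    """Return the level in decibels of an RMS sound pressure given in pascals."""
    if pressure_pa <= 0:
        raise ValueError("pressure must be positive")
    return 20.0 * math.log10(pressure_pa / REFERENCE_PRESSURE_PA)


def pressure_from_level(level_db: float) -> float:
    """Invert the formula: RMS pressure in pascals for a level in decibels."""
    return REFERENCE_PRESSURE_PA * 10.0 ** (level_db / 20.0)


if __name__ == "__main__":
    for level in (0, 80, 100, 130):
        print(f"{level:>3} db corresponds to {pressure_from_level(level):.4g} Pa")
```

Because the scale is logarithmic, each added 20 db corresponds to a tenfold increase in sound pressure under this definition.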

Hearing, related as it is to noise and being an integral part of the 
health of the industrial worker, must be precisely and accurately measured.
An instrument designed to do this measuring of a person's auditory
acuity is the audiometer. Important in making adequate hearing meas- 
urements, in addition to the audiometer, are a sound-proofed testing 
room, or at least a room in which the ambient noise is low and stable, and 
the services of suitably trained technicians. 

Noise and hearing measurement, although requiring some equipment
and trained personnel, is not very expensive and may be highly desirable,
especially in “noisy” industries. 



Noise and Production 


Several surveys and experiments, e.g. (27, 34, 46), have tended to 
substantiate the belief that noise contributes to inefficiency and reduced 
productivity. Although there is a dearth of dependable "before and 
after" production figures available, the bulk of extant evidence shows 
noise to be deleterious to production. The studies that have been made 
assume significance in the accumulation of facts which indicate that even
workers who are accustomed to working under noisy conditions still show 
increase in production when noise is reduced. Kerr (25) and Poffen- 
berger (36), however, have reported that music may in some cases increase 
production. It appears, then, that some carefully controlled research is 
needed to ascertain the exact relation of noise to attention and this in 
turn to efficiency. 


Indirect Effect of Noise on Workers 


In addition to the direct effect of noise upon workers as evidenced by 
production, the indirect influence may be of “dollars and cents” value to 
industrial management. What is the possible contributory influence of 
noise to employee morale, absenteeism, labor turnover, accident rate, 
etc.? In some of the pertinent reports found in the literature (3, 6, 7, 40)
there is extensive evidence that noise tends to influence these factors 
adversely. Most often reported in the employees’ responses to noise 
reduction is the increased “ease of talking” which, it might be readily 
agreed, improves the attitudinal outlook of the employee and increases 
his overall satisfaction. 


Noise and Deafness 


The most convincing argument for reduction of noise in industry is 
the effect of noise on hearing. The literature dealing with the relation 
of auditory stimulation by industrial noise to hearing extends back at 
least as far as Fosbrooke's article (15) Pathology and Treatment of 
Deafness, published in 1831, in which he called attention to the prevalent 
deafness in blacksmiths. A thorough résumé and discussion of the early 
literature on the influence of industrial noises, previous to 1914, can be 
found in an article by Gilbert (17). Another rather complete account 
bearing directly on the deafness problem is A Critical Review of Experi- 
ments on the Problem of Stimulation Deafness, published in 1935 by 
Kemp (24). In 1946, Berrien (2) published a general review, under the 
title of The Effects of Noise, covering such aspects as effects on production,
influence on vital processes, adaptation to noise, etc.

Recent Studies. Among the more recent studies on noise relevant to
hearing in industry is a rather comprehensive one reported by Gardner




[Figure 1: audiogram; frequency axis extends through 4096, 8192, and 16384 cycles.]

Fig. 1. Normal decrease in hearing acuity with increasing age. From Gardner
(16): A = 10-19 years; B = 20-29 years; C = 30-39 years; D = 40-49 years; and
E = 50-59 years.


(16) in which the hearing of shipyard workers was tested and the results 
compared with other groups. 

In Figure 1 is shown an audiogram (a graphic record of auditory 
sensitivity) of the normal decrease in hearing acuity with increasing age.


[Figure 2: audiogram; loss in decibels plotted against frequency.]

Fig. 2. Average hearing loss of 296 shipyard workers according to age. From
Gardner (16): A = 16-20 years (N = 28); B = 21-30 years (N = 92); C = 31-40 years
(N = 120); D = 41-50 years (N = 44); E = 51-65 years (N = 12).




This variation of hearing with age is a natural phenomenon which should 
not be discounted when considering apparent hearing losses of adults. 
Gross discrepancies between the curves shown in Figure 1 and various 
subsequent audiograms indicate that some unnatural element is causative 
of hearing loss. 

Figure 2 displays the principal data obtained by Gardner (16) and 
may be compared with Figure 1 so that the magnitude of hearing loss, 
with attention given to age, will be apparent for those workers who 
represent the shipyard working group. 
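
The comparison between a worker's audiogram and the age norm can be sketched as a per-frequency subtraction. The numbers below are hypothetical and are not read from Figures 1 or 2; the function name is likewise an assumption made for illustration.

```python
# Test frequencies (cycles per second) as printed on the audiogram axes.
FREQUENCIES = (128, 256, 512, 1024, 2048, 4096, 8192, 16384)


def excess_loss(measured_db, age_norm_db):
    """Return per-frequency loss (db) beyond the normal loss for the age group."""
    if len(measured_db) != len(age_norm_db):
        raise ValueError("both audiograms must cover the same frequencies")
    return [m - n for m, n in zip(measured_db, age_norm_db)]


# Hypothetical values for illustration only; NOT taken from Figures 1 or 2.
hypothetical_age_norm = [0, 0, 0, 2, 5, 10, 15, 25]
hypothetical_worker = [5, 5, 5, 10, 25, 45, 40, 45]

if __name__ == "__main__":
    extras = excess_loss(hypothetical_worker, hypothetical_age_norm)
    for freq, extra in zip(FREQUENCIES, extras):
        print(f"{freq:>5} cycles: {extra:+d} db beyond the age norm")
```

A large positive remainder at a given frequency is the kind of gross discrepancy that, on the article's argument, points to a cause other than age.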

Another study on the relationship between noise level of the working 
environment and deafness was made by Rosenblith (38). Because of the 


[Figure 3: audiogram; frequency axis 128, 256, 512, 1024, 2048, 4096, 8192, and 16384 cycles.]

Fig. 3. Comparison between “normal” and industrial hearing loss. From Rosenblith
(38): A = World’s Fair Group, 40-49 years; and B = Boilermakers, average age
[illegible] years.


high noise-level encountered in a boiler factory (about 110 decibels 
average) (1), the main data were gathered there. In Figures 3, 4, and 5 
these audiograms indicate clearly the damaging effect on hearing resulting 
from working in a noisy environment. Figure 3 specifically compares 
the hearing loss among boilermakers to a group of “normals” (as measured 
by the Bell Telephone Laboratories at the 1939 World's Fair). It is
particularly important to note that the groups shown are for all intents
and purposes equated on the age factor. Figure 4 shows the degree of 
hearing loss of workers relative to their tenure on the job. Figure 5 
makes possible visualization by the reader of the relationship between the 
average noise level and the hearing losses. In addition, this figure shows




[Figure 4: audiogram; frequency axis 128, 256, 512, 1024, 2048, 4096, 8192, and 16384 cycles.]

Fig. 4. Hearing loss among boilermakers according to time on the job. From
Rosenblith (38): A = 15-20 years on job; and B = 20-25 years on job.


how the hearing loss is almost completely localized in the region above
1000 cycles with the maximum loss occurring at about 6000 cycles. 
In addition to the data presented in these audiograms Rosenblith 
reported the following findings from a survey in which the same shipyard 
workers were tested before work (in the morning) and again after work (in 


[Figure 5: audiogram; frequency axis 128, 256, 512, 1024, 2048, 4096, 8192, and 16384 cycles.]

Fig. 5. Hearing loss among various categories of personnel after 15 years of
employment in a boiler factory. From Rosenblith (38): A = Machinists (average noise
level 75 db); B = Blacksmiths (average noise level 80 db); and C = Boilermakers
(average noise level 90 db).




the evening): 75% showed greater hearing loss in the evenings; 19% were
the same morning and evening; and 6% were better in the evening.
These percentages point out at least a temporary loss of hearing which
may be directly attributable to the high noise level found in this occupation.

McCoy (32) reports on a study in which 100 preemployment audiometric
tests were made on men going into noisy occupations at a large
shipyard. These individuals gave no past history of exposure to noise,
or disease of ear, nose, or throat. It was found that after working 7 hours
at the task of chipping (110 to 130 decibels) a representative group of
men had a distinct loss in hearing. This decrement was also discernible
the following day. Further, McCoy reports, after a period of one month
there was found to be a definite loss of hearing in the high frequencies
which was not materially affected by a rest of one or two days. Examination
of chippers after a year or more of exposure to this noise revealed a
similar though more extensive loss.

In an earlier survey made by the Department of Labor in New York 
State (48) 1040 workers in several different noisy industries were tested
audiometrically, and the highest incidence of deafness was found to be in 
those industries with the highest noise-level. In the group of workers 
exposed to the noise for less than a year it is reported in this study that 
only 6% appeared to have any hearing loss, while for those exposed to 
noise for 25 years or more, with no history of a possible causative disease, 
26.9% were deaf in some form. It was inferred that these losses for the
long tenure group were more than could be accounted for by the normal 
decrement with age. It was further reported that: “the greatest number 
of cases of deafening falls in the group between ages 21 and 30 years.” 
Whether this age group tendency would hold true for workers in many 
other industries is difficult to predict. 

Physiology of Deafness. Although the question has been extensively studied
and much worthwhile information presented on it, the exact organic
effects of noise on the hearing mechanism are still not conclusively
established. This is chiefly due to a deficiency in our knowledge of the process
of hearing. Good discussions of the hearing mechanism and various
aspects of acoustic phenomena may be found in several sources, for
example (8, 14, 33, 43).

There has been, however, confidence in the scientific foundation of 
deafness as is illustrated by Goldner’s comment (18) that: “The etiology 
of industrial deafness is on a sound pathologic and physiologic basis.
The most important findings are degenerative changes in the outer hair 
cells of the organ of Corti, starting in the basal whorl of the cochlea.” 

The studies reviewed by Kemp (24) on stimulation deafness lead him
to comment summarily: “People who work in extremely noisy environments
are often found to be hard of hearing, particularly for high frequencies.”

Stevens (44) indicates that the explanation of selective hearing loss, 
i.e., loss of acuity for specific frequency ranges instead of complete loss, 
can be made in terms of the construction of the hearing mechanism. He 
says: “High tones are localized near the basal end of the cochlea, 200 
c.p.s. is at the middle and the lower octaves are closely bunched toward 
the helicotrema.”

This arrangement, wherein the high pitches stimulate the basal 
(outermost) portion and the low pitches stimulate the apical (innermost) 
portion, makes it logical to conclude that the common and primary oc- 
currence of high tone hearing loss is due to the vulnerability of the high 
tone receptors. 

Auditory Fatigue. Probably every reader of this paper has experienced
some degree of hearing loss of a temporary nature. Some decrease
in auditory acuity is of such a brief duration that it may be more correctly 
called auditory fatigue than deafness. Rawdon-Smith (37) and Ewing 
and Littler (12) studied this fatigue of the hearing mechanism and 
point out, among other things, that the human threshold of fatigue is 
definitely lower in terms of intensity than is the threshold of feeling. 
This is of significance because while an individual does not think that a 
noise is loud enough to harm him it may unsuspectedly be diminishing 
his ability to hear. Fleming (13) emphasizes that “prolonged exposure 
to loud noise may cause permanent deafness . . . while less noise fre- 
quently causes temporary loss of hearing." Davis (10) after subjecting 
19 men to high intensities of noise resembling airplane noise for several 
days reported that: "temporary impairment of hearing was regularly 
produced, but there was no evidence of cumulative injurious effects." 

Harmful Noise Levels. Davis (8) says the answer is “yes” to the
question, “Will the temporary hearing loss produced day after day and
week after week ultimately become permanent?”, if the noise is loud
enough and if a loud exposure is repeated often enough. Davis adds,
“We do not know how loud . . . is loud enough” to injure permanently.
Hence, it is important to survey noisy industries and occupations to
ascertain the levels which exist, so as to adequately protect the workers
against ill effects. There seems to be great variability among individuals
in their susceptibility to injury from noise as indicated in some reports
(9, 18). However, in spite of this variable resistance it should be possible
to grant a good margin of safety so that we can be confident that no 
workers will be affected adversely. 

Various researchers have suggested different levels of noise above 
which there is danger. Schweisheimer (41) considers that the “hazard 




level” falls between 80 and 90 decibels. McCord and Goodell (29) have 
suggested that a level of “80 to 85 decibels will cause some defects of 
hearing in the high frequency zones after a period of years.” McCord 
and Goodell (29), and McCoy (32) concur that levels over 100 decibels 
are a first concern. Davis (8) states that “noise of less than 100 decibels 
may reasonably be considered quite safe except perhaps for a few un- 
usually susceptible individuals.” Rosenblith (38) admits difficulty in 
specifying a danger level, but believes “as little as 75 or 85 decibels, if 
sufficiently prolonged, will suffice to bring about premature aging of the
ear.” Goldner (18) states that “A consensus seems to indicate that the
minimum safe level is in the neighborhood of 80 decibels.” 
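
The differing opinions cited above can be collected in a small lookup. This is a minimal sketch, assuming that each author's position can be reduced to a single lowest hazard level; where a range is quoted its lower bound is used, which simplifies what the authors actually wrote. The function name is hypothetical.

```python
# Lowest level (decibels) at which each cited author sees a hazard, as read
# from the passage above. Where a range is quoted, its lower bound is used.
SUGGESTED_HAZARD_LEVELS_DB = {
    "Schweisheimer (41)": 80,
    "McCord and Goodell (29)": 80,
    "Davis (8)": 100,
    "Rosenblith (38)": 75,
    "Goldner (18)": 80,
}


def authors_flagging(level_db: float) -> list[str]:
    """Return the cited authors whose suggested hazard level is met or exceeded."""
    return sorted(
        author
        for author, threshold in SUGGESTED_HAZARD_LEVELS_DB.items()
        if level_db >= threshold
    )


if __name__ == "__main__":
    for level in (70, 85, 105):
        print(level, "db:", authors_flagging(level) or "no cited author")
```

On this reading, a measured level of 85 db is already flagged by every cited author except Davis.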
In the light of the preceding opinions expressed on noise intensities, 
based on experience and experimental evidence, it appears from the list
of noise levels shown in Table 1 measured at a distance of three feet from
the machines, that operators are subjected to literally deafening in- 
tensities. 
Sabine and Wilson (39) report that: “Noise level measurements were 
made in 33 separate plants covering a wide diversity of industries and 


Table 1
Noise Levels* of Several Machines (From Sabine and Wilson (39))

Machine                          Noise level, db**
Headers                          [illegible]
Bumping hammers                  [illegible]
Hydraulic press                  [illegible]
Automatic riveters               [illegible]
Lathes (average)                 [illegible]
Automatic screw machines         [illegible]
Airplane propeller grinding      [illegible]
Cotton spinning                  [illegible]

* Information is not available on the frequency components of these noise levels.

** Measured at a distance of three feet from the machines. The opinion has been
expressed that greatest functional utility of the noise-level measurements would accrue
from placing the microphone of the noise-level instrument at the customary position of
the ear of the worker engaged in operating the machine. (Dr. M. D. Steer, Voice Science
Laboratory, Purdue University.)




machine operations. Of all the readings taken in actual work areas, the 
highest was 130 db. and the lowest 65 db. In the majority of cases, the 
observed noise level ranged quite uniformly between approximately 85 
and 105 db." 

McCoy (32) stresses an important point when he suggests that the
opinion of the worker may be unreliable in judging whether a noise is
great enough to be harmful, because pain occurs at a higher level than
that at which noise is actually harmful to hearing.

From the standpoint of expediency in initiating noise control in
industry the advice of McCord and Goodell (29) merits consideration.
They advise: “In estimating a hazardous noise level in industry one
should, from a practical point of view, use a threshold which will not
provide too unwieldy a group requiring preventive treatment so that
management will be more likely to take immediate steps.”


Compensation for Hearing Loss 


Gardner (16) has asked the question in connection with the hearing
loss demonstrated in the audiogram of Figure 2: “What is to prevent
them [these and similar workers with impaired hearing] from proving
their claims in compensation cases?” Bunch (4), reporting on compensation
for hearing disability in 1942, stated that:

“Cases of this kind do appear before courts, and compensation boards
are making awards for hearing losses. The amount of the award often
appears to depend not on the amount of damage to hearing but on the
relative technical skills of the representatives of the interested parties.
A glance through the decisions from different states shows a lack of uniformity.
Only a few states have adopted definite schedules. Insurance
companies have worked out no general schedule of awards for acquired
hearing defects comparable with that used for acquired visual defects.”

That attention is being paid to development of valid and scientific
bases for evaluation of hearing loss is illustrated in a comprehensive
review by Carter (5) covering eleven methods for determining and recording
hearing loss. In 1938 the Consultants on Audiometers and
Hearing Aids of the Council on Physical Medicine of the American
Medical Association, realizing the lack of uniformity of methods for
evaluating the percentage loss of hearing for speech, began formulation
of a usable technique. Their work resulted in publication in 1947 of a
Tentative Standard Procedure for Evaluating the Percentage Loss of
Hearing in Medicolegal Cases (50).

Although this group has done commendable work in formulating this
tentative schedule, one might make the criticism that it adjudges only
“loss of hearing for speech.” Admittedly loss for the hearing of speech
is the important item, but it does seem that often compensation for loss
in high and low frequencies should probably be granted if it is construed
to be a disability, as it may very well be; e.g., the worker suffering high
frequency loss (the most common type) is occasionally endangered and
often inconvenienced by not being able to hear the shrill tones of some
warning devices.

Dr. W. E. Grove (19), chairman of the sub-committee on noise of the
Committee on Conservation of Hearing of the American Academy of
Ophthalmology and Otolaryngology, has recently reported initiation of
careful, exact, scientific examination of the hearing of employees in noisy
industries. This work aims at actual “in plant” study, particularly, in
an effort to discover characteristics of those employees whose hearing 
mechanism may be “noise-susceptible.” The work of this group can be 
expected to throw new light on the noise problem in industry. 

Diagnosis of Occupational Deafness. Goldner (18) indicates that the 
correct diagnosis of occupational deafness is especially important as 
cases are brought before compensation boards. He finds it worthwhile 
to describe under the heading of deafness due to noise and explosive 
sounds, two types of deafness, viz.: (1) acute (easily observable and due
to explosive noises of extreme intensity) and (2) chronic (of most import- 
ance to the medical profession, industry and labor . . . “deafness is 
insidious in onset and the intensity of the noise is below the level of painful
stimulation”). Goldner states that the technique for diagnosis of
the chronic type is: 

“. . . based on a history of exposure to noise capable of producing
injury to the organs of hearing; a normal ear drum; an elevated threshold
for high tones, as manifested on the audiogram and with tuning fork or
tests with Galton’s whistle; diminished bone conduction, as shown by a
positive Rinne and a positive Schwabach test; normal vestibular responses;
patent eustachian tubes; differentiation from other lesions
capable of producing a similar clinical picture.”

Perlman (35) emphasizes the difficulty of determining cause and effect
relationship with respect to auditory loss. He gives the following list of
the relevant factors influencing the degree of the loss: (1) total time of
exposure; (2) length of each exposure; (3) loudness of sound stimulus;
(4) age of subject; (5) constitutional factors; (6) character of sound,
constant or sharp; (7) use of protective devices; (8) exposure in closed
or open spaces; (9) previous aural disease; and (10) frequency of
stimulus.

“The problem of determining how much damage of the hearing is due
to the accompanying pathologic process and how much is due to industrial
noise is a difficult one; each case must be judged individually” (18).




Eliminating or Controlling Noise in Industry 

Among the ways of eliminating or at least reducing noise in industry
are: acoustical treatment, isolation of the source, reduction of vibration,
maintenance and substitute methods, and equipment and plant design.

Lindahl (28) reports that acoustical treatment will yield only 6 to
[illegible] decibel reduction, but he goes on to point out that this amount of change
may result in a decrease in loudness, according to the average worker, of
anywhere from 30 to 70 per cent which from an attitudinal standpoint, at
least, is considerable. A thorough discussion of sound absorption is given
by Sabine and Wilson (39). A report of an “in plant” experiment on the
advantages of acoustical treatment is given by Berrien and Young (3).
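
The relation between a modest decibel reduction and a sizable drop in judged loudness can be sketched numerically. The example assumes the common rule of thumb that judged loudness halves for every 10 db of reduction; this rule and the function name are assumptions, not figures taken from Lindahl's report.

```python
def loudness_decrease_percent(reduction_db: float) -> float:
    """Estimate the percentage drop in judged loudness for a level reduction.

    Assumes loudness halves for every 10 db of reduction (a rule of thumb,
    not a figure from Lindahl's report).
    """
    if reduction_db < 0:
        raise ValueError("reduction must be non-negative")
    remaining_fraction = 2.0 ** (-reduction_db / 10.0)
    return 100.0 * (1.0 - remaining_fraction)


if __name__ == "__main__":
    for reduction in (6, 10):
        percent = loudness_decrease_percent(reduction)
        print(f"{reduction} db reduction: about {percent:.0f} per cent quieter")
```

Under this assumption a 6 db reduction already corresponds to a loudness decrease of roughly one third, which is in line with the lower figure quoted above.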

Isolation of noise sources has been the only solution to the noise
problem in some plants such as those having engine test stands. Vibration
reduction by use of cushioning, rubber and felt mountings for
equipment, and even floor suspension has decreased the noise problem in
many enterprises. It has been inferred (28) that when the designer tries
to reduce vibration to preclude noise he finds very often that he has
achieved efficiency.

Quite obvious is the effect of properly maintaining equipment in
order to keep it free from rattles, squeaks, etc., which tend to wear both
on the human and on the machine. Substitution of quiet operations
such as welding for riveting may materially reduce noise levels in those
operations where they are feasible.

Forethought given to machine and plant design may solve noise 
problems before they occur. Johnston (23) suggests that a 5 to 10 degree 
outward slant of walls deflects sound into sound absorbent materials
attached to the ceiling, which should be as low as is practical in order to 
bring the absorbent material close to the noise source. 

In addition to those already mentioned many valuable suggestions on 
ways to control noise in industry may be found in the published literature, 
some of which are the following: (1, 7, 13, 22, 28, 30, 39, 47, 49).

In the event of the impossibility of eliminating a noise hazard, as
in the case of explosions in some occupations, the use of ear plugs or
stoppers is an alternate procedure. McCoy (31) states that: “Workers
should be safeguarded against the developing cushion of deafness and
urged to wear [ear] protection for health.” Other researchers (8, 11, 26,
45) have also recommended various types of ear defenders, from a wad
of cotton to the V51-R “Ear Warden.”

The “Ear Warden" ear plug was devised in the Psycho-Acoustic 
Laboratory at Harvard University under the auspices of the National
Research Council. It is moulded of soft black neoprene, a synthetic 




rubber. It was found to be practical on the basis of ease of insertion, 
cleanliness, durability, low cost, feasibility of mass production, and 
comfort to the wearer. Extensive testing demonstrated the efficiency 
of the plug. (45) 

Goldner (18), however, points out a disadvantage of present ear protection
when he says: “most workers are reluctant to submit to any
effective dampening of normal hearing impulses,” because of the obvious 
interference with detection of warning sounds, either vocal or from ma- 
chines. He believes that possibly some sort of sound filter can be devised 
to filter out harmful sound intensities in high or low frequencies and still 
allow perception of speech. 

Other objections (21) to ear plugs are that they cause pressure to 
build up in the ears and make the ears feel stuffy; they trap moisture in 
the ear and promote growth of fungi; and further, they may carry in- 
fectious material into the ear. The use of external coverings of the ear 
(ear-caps) has been suggested as a possible solution for exclusion of noise
to the ear. They would avoid the difficulties mentioned for the ear-plugs
and would be especially advantageous in that an easy check could be
made on whether or not ear protection was actually being used. 

In general, if ear protectors of any type are a real safeguard to the
worker's welfare it would seem that management's insistence on employees
wearing them is as justified as regulations concerning the use of
safety eye-wear to prevent damage to the eyes. 


Summary 


From the standpoint of both management and labor there are many 
advantages to be gained from control of noise within industry. Complete 
condemnation of noise in industry may be unwarranted in view of the 
possible exhilarating influences of some sounds; also, some experimenta- 
tion (42) suggests that the harmful effect of noise has been overem- 
phasized. However, the greater quantity of the experimental evidence 
indicates that there are many circumstances wherein noise is deleterious. 
Especially significant are the deafening effects on the worker in noisy 
industries. The authors of this article agree with Gruss (20) who con- 
cludes, “The great and overwhelming weight of authority is that: (1) 
hearing is impaired by industrial noise; (2) groups subjected to the 
loudest noises are most affected. . . .” 

The immediate action on the part of the employer who may be con- 
fronted by the noise problem would appear to be: 




(1) noise measurement, in terms of intensity levels and composition; 
(2) institution of feasible noise elimination or reduction measures; 
(3) establishment of some form of hearing testing program for 
workers engaged in noisy occupations, the first step of which would 
be preemployment audiometric testing of those employees who 
are to work in “harmful” noise levels.
Received July 1, 1948. 
Early publication. 


References 


1. Allen, A. H. Reduce noise in steel foundry cleaning room; Monroe Steel Casting Co.
Foundry, 1941, 69, 60-61, 143.

2. Berrien, F. K. The effects of noise. Psychol. Bull., 1946, 43, 141-161.

3. Berrien, F. K., and Young, C. W. Effects of acoustical treatment in industrial
areas. J. acoust. Soc. Amer., 1946, 18, 453-457.

4. Bunch, C. C. Conservation of hearing in industry. J. Amer. med. Ass., 1942,
118, 588-593.

5. Carter, H. A. Review of methods used for estimation of percentage loss of hearing.
Laryng., 1942, 52, 879-890.

6. Cunningham, E. A. Psychological aspects in the treatment of industrial injuries.
Industr. Med., 1944, 13, 119-120.

7. [Name illegible], A. H. Some aspects of the problem of noise. Occup. Psychol., 1938, 12,
[pages illegible].

8. Davis, H. (Ed.). Hearing and deafness. New York: Murray Hill, 1947.

9. Davis, H., Derbyshire, A. J., Kemp, E. H., Lurie, G. H., and Upton, M. Experimental
stimulation deafness. Science, 1935, 81, 101-102.

10. Davis, H., Morgan, C. T., Hawkins, J. E., Galambos, R., and Smith, F. W. Temporary
deafness following exposure to loud tones and noise. Committee on
Medical Research of the O. S. R. D. Boston: Dept. of Physiology and Dept. of
Otology and Laryngology, Harvard Medical School, Sept. 30, 1943. Pp. 71,
appendix.

11. Dickson, E. D. D., and Ewing, A. W. G. The protection of hearing. J. Laryng., 
1941, 56, 225-242. 

12. Ewing, A. W. G., and Littler, T. S. Auditory fatigue and adaptation. Brit. J.
Psychol., 1935, 25, 284. 

13. Fleming, N. Noise and its prevention. J. Text. Inst., Manchr., 1939, 30, 261-271. 

14. Fletcher, H. Speech and hearing. New York: Van Nostrand, 1929. 

15. Fosbrooke, J. Pathology and treatment of deafness. Lancet, 1830-1831, 1, 645. 

16. Gardner, W. H. Injuries to hearing in industry. Industr. Med., 1944, 13, 676-679. 

17. Gilbert, D. J. Influence of industrial noises. J. industr. Hyg., 1921, 3, 264-275. 

18. Goldner, A. Occupational deafness. Arch. Otolaryng., 1945, 42, 407-411.

19. Grove, W. E. (Personal communication.) November 4, 1947. 

20. Gruss, L. Effect of noise on the hearing of industrial workers. Volta Rev., 1939, 
41, 511-514, 535. 

21. Guild, S. R. (Personal communication.) November, 1947.

22. Hodge, W. J. Sounds control and noise elimination. Person. J., 1936, 15, 11-18. 

23. Johnston, J. M. Noise reduction in mills meets workers’ approval. Text. World, 
1944, 94, 105-107. 




2. Kemp, E. H. A critical review of experiments on the problem of stimulation deaf- 


ness. Psychol. Bull., 1935, 32, 325-342. 


25. Kerr, W. A. Experiments on the effects of music on factory production. Appl. 


B 88 N 


~ 
= 


S 8 9 8 BH" Е БНР 


5 5 È BS 


. How industry battles noise to win 


Psychol. Monogr., 1945, No. 5, 40. 

Knudsen, V.O. Defense against noise: science seeks practical methods of protecting 
the ears against the injurious noises of a mechanized age. Nat. Safety News, 
1942(2), 45, 28-29. 

Laird, D. A. Influence of noise on production and fatigue, as related to pitch, 
sensation level and steadiness of the noise. J. appl. Psychol., 1933, 17, 320-330. 

Lindahl, R. Noise in industry. Industry. Med., 1938, 7, 664-669. 

McCord, С. P., and Goodell, J. D. Abatement of noise. J. Amer. med. Ass., 1943, 
123, 476-480. 

McCord, C. P., Teal, E. E., and Witheridge, W. N. Noise and its effect on human 
beings. J. Amer. med. Ass., 1938, 110, 1553-1560. 

MeCoy, D. А. Control of industrial noise. Safety, 1944, 31, 280, 293. 

McCoy, D. А. The industrial noise hazard. Arch. Orolaryng., 1044, 39, 327-330. 

Morgan, C. T. Physiological psychology. New York: MeGraw-Hill, 1943. Pp. 
219-254. 

Obata, J., Morita, S., Hirose, D., and Motsumoto, H. Effects of noise on human 
efficiency. J. acoust. Soc. Amer., 1934, 5, 255-261. - 

Perlman, H. B. Acoustic trauma in man, clinical and experimentalstudies. Arch. 
Otolaryng., 1941, 34, 424-452. 

Poffenberger, A. T. Principles of applied psychology. New York: Appleton- 
Century, 1942. Pp. 141-143. 

Rawdon-Smith, A. F. Experimental deafness: further data upon the phenomenon 
of so-called auditory fatigue. Brit. J. Psychol., 1936, 26, 233-244. 

Rosenblith, W. A. Industrial noises and industrial deafness. J. acoust. Soc. Amer., 
1942, 13, 220-225. 

Sabiné, Н. J., and Wilson, В. А. The application of sound absorption to factory 
noise problems. J. acoust. Soc, Amer., 1943, 15, 27-31. 

Sabine, P. E. The problem of industrial noise. Amer. J. publ. Huh., 1044, 34, 
265-270. 


. Schweisheimer, W. Effects of noise in the textile industry. Rayon Text. Mo., 


1945, 26, 593. 

Stevens, S. S. The science of noise. Atlant. Mon., 1946, 178, 96-111. 

Stevens, S. S., and Davis, H. Hearing, its psychology and physiology. New York: 
Wiley, 1947. 

Stevens, S. S., Davis, H., and Lurie, M. H. The localization of pitch perception on 
the basilar membrane. J. gen. Psychol., 1935, 13, 297-315. 

Weiss, J. A. Deafness due to acoustic trauma in warfare. Ann. Otol., etc., St. 
Louis, 1947, 56, 175-186. 

Weston, H. C., and Adams, S. The effects of noise on the performance of weavers. 
Indus. Hlth. Res. Bd., 1932, Rep. No. 65, 38-62. 

Wyder, C. G. Noise in industry: primer of cause, effect, and cure. Factory Mgt. 
and Maintenance, 1944, 102, 137-156. 

Effect of noise on hearing of industrial workers. Bureau of Women in Industry. 
N. Y. (State) Dept. Lab., Spec. Bull. No. 166, Sept. 1930. 

(State) Dept, Paa ein production. (Anon.) Modern Ind., 1943, 6, 
eo D D * 

Tentative standard procedure for evaluating the percentage loss of hearing in medicolegal 
cases. Council on Physical Medicine. J. Amer. med. Ass., 1947, 133, 396-397. 


Application of Addends to Sales and Clerical 
Occupational Classification * 


Bernard M. Bass 
The Ohio State University 


Addends offer a means by which, for a given number of factors and 
grades within factors, all their possible patterns can be meaningfully 
coded. Little known, the use of addend coding has been limited mainly 
to the coding of data to facilitate large-scale statistical research. Yet, as 
Toops (9, 10) has suggested, addends may have widespread application 
in biology, sociology, business, and applied psychology. 

Following Viteles' (11) original work on the job psychograph, several 
techniques have been developed which provide for the classification of 
occupations according to patterns of factors or characteristics and which 
allow for matching of the individual's combination of characteristics with 
occupational requirements in the same set of terms.¹ These include the 
Minnesota Occupational Rating Scales (6), Michelman's (5) method for 
classifying occupations according to the demands made on the personal 
characteristics of workers, the Entry Occupational Classification (14) 
and the Occupational Aptitude Patterns (3) used in conjunction with the 
new USES General Aptitude Test Battery. 

In the study to be presented, an addend classification was developed 
for 781 sales and clerical occupations to indicate the possibilities of apply- 
ing addends to occupational classification. This attempt was mainly 
intended as an experiment in the use of addends and was not necessarily 
directed towards proving or disproving its value as compared with other 
existing occupational classification techniques. At present, however, the 
results obtained have been set up as an experimental counseling aid and 
distributed to the various vocational and educational counseling agencies 
at Ohio State University. Evaluation of the addend classification as a 
new tool for vocational guidance will be forthcoming. The value of 


* The author is indebted to Dr. Carroll L. Shartle for his valuable guidance in helping 
carry through this study; to Dr. Frank Fletcher for his constructive criticism and pres- 
entation of the original problem; to Dr. Herbert Toops for some original ideas concerning 
the addend technique, and to the United States Employment Service for making avail- 
able the data used. 

¹ For further discussion of the implications involved in a truly functional classifica- 
tion of occupations, the reader should refer to Alexander, W. P. Research in guidance: 
a theoretical basis. Occupations, 1934, 12, 75-91. 


490 


Addends in Occupational Classification 491 


addend classifications as a personnel technique for business and industry 
remains to be investigated. At present, one can only speculate as to the 
potentialities of addend classifications of occupations. 

Definitions of some of the most frequently used terms in this report 
are as follows: 

Element. An original job or worker characteristic used by the Occupational 
Analysis Section of the USES in gathering and processing the information used 
in this study. 

Factor. An element or a group of elements used in construction of the 
patterns upon which the classification system is based. 

Category. The grades which exist within a factor denoting a subdivision 
or whether or not the factor is required by a given occupation. 

Addend. An integer assigned to each category of a factor according to a 
preconceived numerical system determined by formula. 

Addend code. The arithmetical sum of addends which will describe both 
the factors and the categories of the factors for any given occupation. It is 
the numerical sum which represents the pattern of requirements of a given 
occupation. 

MET code. The three-character code which describes the machine used, 
experience required by, and the training necessary to reach maximum 
production for a given occupation. 

Residual code. The code of those factors which appear infrequently. The 
letters representing the residual factors are included in the total code when the 
factors are required by a given occupation. 

Total code. The combination of addend, MET and residual codes. It will 
describe seventeen variables and their subdivisions. 

For definitions of the various elements see (15), (16) and (17). For 
definitions of other terms pertinent to addend coding see (10). 

The information concerning the 781 sales and clerical occupations 
(which cover 44 industries) classified in this study came from the extensive 
research done by the OAS of the USES since 1934 on approximately 9000 
occupations. Estimates by job analysts of the worker characteristics 
required for success were collected, summarized and placed on speed sort 
cards. Hiring requirements were the minimums given by employers. 
A card representing each occupation contained all the gathered informa- 
tion on approximately eighty variables. 

In terms of the scale of ratings used by the analysts, each characteristic 
punched on a card indicated that either a very high degree of the char- 
acteristic was required in some phase of the job or an above average 
degree was required, either in numerous tasks of the job or in the major 
or most skilled task. For the purposes of this present investigation, 
since “very high” could not be differentiated from “above average,” the 
expedient phrase, “high degree,” which would suffice to cover both, was 
chosen to describe the amount of a given characteristic either required or 
not required by an occupation. A characteristic was considered to be 
significantly required to a high degree, or not at all. Since the use of 
critical scores to dichotomize measured characteristics is a common 


492 Bernard M. Bass 


practice, the dichotomization used here for estimates should not present 
too great a semantic problem. 

In its present form, the occupational information compiled by the 
USES lends itself readily to analysis for construction of occupational 
families. These families may aid the counseling of the unemployed and 
those interested in changing from one occupation to another. However, 
they are, as yet, unsuitable for use by vocational interviewers for more 
generalized problems. It was the intention in this study to develop a 
method for transforming this information into a system which might 
make wanted information on occupations ready for practical use in the 
vocational guidance and employment situations. By coding seven vari- 
ables with one numerical system and giving information on ten other 
variables, the classification proposed here may enable the matching of 
individuals’ patterns with occupational patterns to be done quickly, 
efficiently and easily. A need was recognized for a high degree of flexi- 
bility in occupational classifications and allowance for it was incorporated 
into the addend coding structure. In addition, with the same data used 
to construct the many Job Family Series, this investigation attempted 
the development of a method for constructing a “universal” job family 
as an integral part of the overall addend classification. 


The Method 


The first operation performed in this present study was to organize 
into seventeen factors the 81 elements included on the USES speed sort 
cards. Jaspen (4) through factorial analysis had found several factors 
which he named strength, intelligence, inspection, physically unpleasant 
working conditions, mechanical dexterity and mechanical information. 
It was realized that Jaspen had sampled only skilled, semi-skilled and un- 
skilled industrial occupations and that a factor analysis of sales and 
clerical occupations might be desirable to insure adequacy of the factors 
to be used in classification. However, at the time of this study, such an 
analysis was not available.² 

After careful scrutiny of the 781 occupations and following Jaspen’s 
findings when advisable, seven major and ten minor factors were arrived 
at a priori. The seven major factors, the ones deemed most frequent, 


² Subsequent to this study, Williams (12) completed a factor analysis of the 781 oc- 
cupations and partly substantiated the major factors selected a priori. The factors he 
found were manual dexterity, intelligence, social intelligence, mental ability, and accu- 
racy. Arithmetic computation fell under the factor of accuracy and mental ability was 
split into two factors, a higher order intelligence and a mental ability for rote learning. 
This differentiation was partly taken into account in the addend classification by the 
educational factor. It also seems to have been profitable to separate arithmetic com- 
putation and fine accuracy. 




significant and crucial were sex, education, mental ability, manual dex- 
terity, social intelligence, fine accuracy (accuracy in checking, transcrib- 
ing, typing, filling orders, etc.) and arithmetic computational ability. 
These were to be the multiple bases for the addend coding structure. As 
will be seen later, provision was also made for the ten other factors which 
for one reason or another were not included in the addend part of the 
total code. 

It was apparent that finding the proper occupation or occupations 
for an individual (i.e., matching the individual and the occupation) 
would first involve finding the occupation with the highest possible 
pattern of requirements or factors which the individual both could and 
would fulfill. To give the classification greater value through greater 
flexibility, it would be necessary to be able also to arrive at occupations 
other than those with the pattern of the maximum requirements which 
the individual could and would fulfill. 

To classify occupations according to their patterns of requirements 
and to accomplish the previously mentioned objectives, addends were 
used to develop the major part of the classification coding system pre- 
sented in this study. Toops (9), in 1935, described a method which he 
stated “would be a practical means of coding human test profiles, and 
would enable all scientific classification systems to become more generally 
useful.” This, the addend technique, was a means of numerically coding 
all possible profiles that might exist in m traits, either qualitative or 
quantitative. All the permutations of profiles possible among m traits, 
taking the traits n at a time, could be meaningfully coded. 

Statistical control was acquired by the development of a formula 
applicable to addends where the number of categories per trait was a 
constant. The formula Toops (9, p. 2181) presented was: 


\[ A = nN^{m-1}, \]


where A = addend appropriate to a given category of a trait in question. 
N = number of categories per trait (a constant). 
n = position of the category compared with others for the same 
trait. 
m = the number of the trait. 


Since the information used in this present study of occupations would 
not permit a constant number of categories, a more generalized formula 
which would yield addends when the number of categories varied, was 
necessary. During the course of this investigation such a generalized 
formula was found which would apply to all coding situations. It is as 
follows: 


\[ A = (n - 1)\left[ N_{(m-1)} N_{(m-2)} \cdots N_{1} + E \right], \]




where A = addend appropriate to a given category of a factor in question. 
N = number of categories for a given factor (variable). 
n = position of the category compared with others for the same 
factor. 
m = the number of the factor. 


When considering the first factor, or when m = 1, then E = 1; 
when considering the rest of the factors, or when m is greater than 1, then 
E = 0, and drops out of the equation. The term N(m-1) N(m-2) ... N1 
begins with the number of categories in the factor preceding the one under 
consideration, and is the product of the number of categories in each of 
the preceding factors. As there is no factor, hence no categories preceding 
factor No. 1, when finding the addends for factor No. 1, the term is equal 
to 0. 
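
The rule just described can be sketched in a few lines of Python. This is an illustrative reading of the generalized formula, not the author's own procedure: the function name `addends` and the list `COUNTS` are assumptions introduced here, and the multiplier for the first factor is taken as 1 (an empty product), which plays the role of the E = 1 case.

```python
from math import prod

def addends(category_counts):
    """Return, for each factor, the addends of its categories.

    category_counts[m - 1] is N for factor m. The addend of the category
    in position n (1-based) is (n - 1) times the product of the category
    counts of all preceding factors; for the first factor that product
    is taken as 1.
    """
    table = []
    for m, n_categories in enumerate(category_counts, start=1):
        multiplier = prod(category_counts[:m - 1])  # empty product is 1
        table.append([(n - 1) * multiplier for n in range(1, n_categories + 1)])
    return table

# Assumed category counts: sex has 3 categories, education 4, and the
# remaining five factors 2 each (required or not required).
COUNTS = [3, 4, 2, 2, 2, 2, 2]
for factor_addends in addends(COUNTS):
    print(factor_addends)
```

With these counts the sketch yields 0, 1, 2 for the first factor, as stated in the text for sex, and the product of all seven counts is 384.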

Sex, which had three categories (male required, either sex required, and 
female required), was chosen as factor No. 1. Substituting in the general- 
ized formula, the addends appropriate to the three categories were 0, 1 
and 2 respectively. Sex was placed as the first factor of the classification 
in order to introduce a high degree of flexibility. If no suitable occupa- 
tion were available for a male with any given pattern, addition of 1 to 
his addend code would yield all the occupations with job patterns the 
same as his except that either sex could qualify for the occupation. 
Similarly, subtracting 1 from any female’s code would do likewise. 

Table 1 presents the addends appropriate to each of the categories of 
each of the factors of the addend system used to classify the 781 sales and 
clerical occupations. 

If one or more elements were required to a high degree by an oc- 
cupation, then the factor under which the element was grouped was con- 
sidered as required to a high degree by the occupation in question. Cod- 
ing of occupations required simple addition of the correct addend for each 
factor but only one addend for each factor could be added. The decoding 
of an addend code, whenever necessary, is a process of simple subtraction. 
The method is to subtract successively, the largest addends of the system, 
which are still contained in the successive remainders of the addend code. 
The addends subtracted identify the original categories of the factors 
which are required by the occupation so addend coded. 

The total number of addend codes possible was 384, running from 0 
through 383. 
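
Coding by addition and decoding by successive subtraction can be sketched as follows. The dictionary `ADDENDS` copies the addends of Table 1; the function names `encode` and `decode` and the use of 1-based category positions are assumptions made for this illustration.

```python
# Addends of Table 1, per factor, in category order (position n = 1, 2, ...).
ADDENDS = {
    "sex": [0, 1, 2],
    "education": [0, 3, 6, 9],
    "mental ability": [0, 12],
    "manual dexterity": [0, 24],
    "social intelligence": [0, 48],
    "fine accuracy": [0, 96],
    "arithmetic computation": [0, 192],
}

def encode(positions):
    """Add exactly one addend per factor; positions maps factor -> position n."""
    return sum(ADDENDS[factor][positions[factor] - 1] for factor in ADDENDS)

def decode(code):
    """Subtract, factor by factor, the largest addend still contained in the remainder."""
    positions = {}
    remainder = code
    for factor in reversed(list(ADDENDS)):  # start with the largest addends
        largest = max(a for a in ADDENDS[factor] if a <= remainder)
        positions[factor] = ADDENDS[factor].index(largest) + 1
        remainder -= largest
    return positions

print(decode(66))
```

Decoding 66 in this way removes 48, then 12, then 6, leaving 0 for sex, which matches the pattern of the code 66 discussed in the text. Every code from 0 through 383 decodes to one pattern and encodes back to itself.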

An example of an addend code is 66, which indicates that the occupa- 
tion with such a code number requires a male (addend 0), high school 
graduate (addend 6), with a high degree of mental ability (addend 12), 
and a high degree of social intelligence (addend 48). 


Table 1 
The Addend System for Clerical and Sales Occupations 
Position of the Categories and Their Addends 

                          n = 1               2                3               4
m   Job Factor            A                   A                A               A            N
---------------------------------------------------------------------------------------------
1.  Sex                   0  Male             1   Either       2  Female                    3
2.  Education             0  Less than        3   (8th-12th)   6  (12-16th)    9  (16th)    4
                             grammar
3.  Mental                0  Not              12  Required                                  2
    Ability                  required
4.  Manual                0  Not              24  Required                                  2
    Dexterity                required
5.  Social                0  Not              48  Required                                  2
    Intelligence             required
6.  Fine                  0  Not              96  Required                                  2
    Accuracy                 required
7.  Arithmetic            0  Not              192 Required                                  2
    Computation              required
---------------------------------------------------------------------------------------------

The MET code which follows after the addend code was in the 
typical symbol form. The first letter of the MET code represented the 
type of machine or machines fed, tended or operated by the occupation 
so coded. Usually, the letter code used was the initial of the machine's 
name. Thus A is the code for adding machine, T for typewriter and N 
indicates that no machine is used. 

The next symbol of the MET code, a digit, indicated the type of 
experience required by the occupation so coded. None, same, similar or 
other experience or some combination of the latter three were coded by 
the use of one of nine digits. 

The last symbol, a digit, indicated the amount of training in terms of 
time required by the occupation so coded to reach maximum production. 
This varied from less than a day's training to more than six months' 
training and was represented by 0, 2, 4, 6, and 8. 

An example of the MET code is A28 which indicates that an adding 
machine is operated, similar previous experience is required and more than 
six months’ training is needed to reach maximum production in the oc- 
cupation so coded. 

Why were not the machines, experience and training factors included 
in the addend code instead of being set up as a separate part of the total 
code? The main reason is that none of these three factors could fit into 
a counselee-occupation matching procedure. It should be obvious that 




experience could not be considered without particular reference to a 
given occupation. Thus, the occupation must be established by the 
addend code before experience can be discussed by the counselor with the 
counselee. Similarly, "training required" could in no way be discussed 
with the counselee until the occupation was found with the training re- 
quirements to be considered. Then, the question would be whether the 
counselee was willing to undergo the training required. 

The residual code represented those factors which occurred infre- 
quently enough to be coded in straight symbol fashion. If the residual 
factor was required to a high degree by a given occupation, a letter rep- 
resenting that factor was included in the occupation’s total code. Other- 
wise, the letter did not appear. There were seven residual factors with 
code letters as follows: C = Concentrate Amid Distractions, K = Special 
Knowledge Required, P = Perceptual or Estimating Ability, R = 
Repetitive Job, S = Strength Required, U = Unpleasant or Hazardous 
Conditions, X = Seasonal Job. 

An example of a total code, with its addend, MET and residual 
components is 60-A28-CP. From this code, the reader can now describe 
which of seventeen factors and their various subdivisions are required by 
an occupation with such a code number. 
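
As an illustration of how a total code separates into its three components, the following sketch splits a code of the form given above. The function name `parse_total_code` and the returned field names are hypothetical; the residual letters are those listed in the text, and the experience and training digits are returned as numbers without interpretation, since the full digit scales are not reproduced here.

```python
# Residual code letters as listed in the text.
RESIDUAL_FACTORS = {
    "C": "Concentrate Amid Distractions",
    "K": "Special Knowledge Required",
    "P": "Perceptual or Estimating Ability",
    "R": "Repetitive Job",
    "S": "Strength Required",
    "U": "Unpleasant or Hazardous Conditions",
    "X": "Seasonal Job",
}

def parse_total_code(total_code):
    """Split a total code such as '60-A28-CP' into addend, MET and residual parts."""
    parts = total_code.split("-")
    met = parts[1]
    residual_letters = parts[2] if len(parts) > 2 else ""  # residual part is optional
    return {
        "addend_code": int(parts[0]),
        "machine": met[0],          # letter: initial of the machine, N for none
        "experience": int(met[1]),  # digit for type of experience required
        "training": int(met[2]),    # digit for amount of training required
        "residual": [RESIDUAL_FACTORS[letter] for letter in residual_letters],
    }

print(parse_total_code("60-A28-CP"))
```

The addend component can then be decoded by subtraction, while the MET and residual components are read symbol by symbol.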


Results 


The 781 clerical and sales occupations were coded according to their 
requirements as explained in the previous sections and listed according to 
the numerical order of their respective codes. Because of space limita- 
tions, only a sample of the list can be presented in this report. For the 
complete list refer to Bass (1). 

Included with the list of coded occupations are examples of work per- 
formed by arbitrarily grouped families of occupations. Each group of 
12 addends beginning with 0 is a family of occupations which vary only 
in sex and educational requirements. The work performed is arbitrarily 
defined as all the action verbs present in more than 10% of the occupations 
of a given family. For example, the 12 addend codes from 0 through 11 
are codes of occupations which only have sex and educational require- 
ments. The work done by this family of occupations according to the 
action verbs found in more than 10% of the occupations of the family 
includes checking, filing, sorting and typing. 
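
Because sex and education together account for 12 combinations (3 times 4), the family to which a code belongs can be found by integer division. The sketch below is an illustration under that reading; the function names `family_index` and `sex_education_part` are assumptions introduced here.

```python
def family_index(addend_code):
    """Codes 0-11 form family 0, codes 12-23 family 1, and so on."""
    return addend_code // 12

def sex_education_part(addend_code):
    """The part of the code contributed by sex and education alone (0-11)."""
    return addend_code % 12

# Codes 60 and 66 both appear in the text as examples.
for code in (60, 66):
    print(code, family_index(code), sex_education_part(code))
```

Codes 60 and 66 fall in the same family and differ only in the education addend, so 384 codes divide into 32 such families.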

Much interesting occupational information was observed by inspection 
of the addend coded lists. The generalizations which follow were set 
down as inferences arrived at through observation and inspection and 
should be verified by statistical treatment. For the purpose of economy, 




Table 2 
Sample of the List of Addend Coded Occupations* 

Total          DOT 
Code           Code       Occupational Title and Industry 
0-N00          1-23.14    Messenger (clerical) 
15-N04-R       1-18.66    Checker (iron and steel) 
31-T00-R       1-87.34    Clerk, typist (insurance) 
48-N02         1-23.14    Delivery boy (garment) 
66-N32         1-55.10    Solicitor (business services) 
100-N04-X      1-17.01    File clerk (clerical) 
114-N00-K      1-38.01    Stock clerk, aircraft shop (air transportation) 
126-T22-R      1-37.34    Securities index clerk (banking) 
140-T02        1-37.12    Stenographer (clerical) 
220-A06-R      1-05.01    Sales tax clerk (clerical) 
261-N40        1-65.04    Municipal bond buyer (financial) 
305-N06-KP     1-18.66    Production weigher (textile) 
328-N02        1-26.03    Timekeeper (clerical) 
352-N76-R      1-03.02    Receive clerk (financial) 
381-A06-KP     1-48.24    Field contact man (dairy products) 


* Copy of the addend list of occupations with directions for its use can be obtained 
from the author, c/o Occupational Opportunities Service, Ohio State University. 


whenever a factor is mentioned in the following discussion, a high degree 
of the factor will be implied but not repeatedly stated as such. 
Some of the conclusions drawn are as follows: 


1. Most of the occupations with no requirements other than sex or 
education, are straight repetitive occupations which involve little re- 
sponsibility and few decisions. Only as education requirements increase 
is much training or experience necessary. Many occupations in this 
family are entry occupations. 

2. Social intelligence is the major dichotomizing factor between sales 
and clerical occupations. While clerical occupations may or may not 
require social intelligence, most sales occupations do require a high degree 
of this factor. 

3. Certain industries tend to dominate certain families of occupa- 
tions, i.e., occupations with a specific pattern of requirements are peculiar 
to one or two industries. 

4. Those occupations which require only a high degree of social in- 
telligence (with varying sex and educational requirements) are, for the 
most part, entry occupations and the relatively unskilled “meet the 
public” occupations. When mental ability requirements are combined 
with social intelligence requirements, higher skilled “meet the public” 
occupations arise and supervisory work enters the family. 




5. Quite often, a group of occupations with the same DOT classifica- 
tion numbers showed the same addend codes. One may infer that some 
relation exists between occupational requirements upon which the 
addend system is based and work done and industry upon which the 
DOT is primarily based. For many examples of the relationships that 
appear to exist between occupational requirements and work done, see 
Bass (1). 


The 781 occupations were described by 188 patterns, i.e., 188 out of a 
possible 384 addend codes were used to represent the 781 occupations. 
Seldom did one addend code represent more than fifteen occupations. 

Three reasons can be advanced for failure to use all possible 384 ad- 
dend codes. The fixed structure of the addend system left 25 per cent of 
the addend codes open to college graduates, whereas only 2 per cent of 
the occupations required college graduates. Two-thirds of the addend 
codes were available for occupations which required females or had no sex 
requirement, whereas only 41 per cent of the occupations were in these two 
categories. Lastly, as Williams (12) found, there was a correlation be- 
tween fine accuracy and arithmetic computation so that many of the 
addend codes available for occupations with only the latter factor were 
left unused. 
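
The proportions quoted for the code structure can be checked by enumerating every possible pattern. The sketch below is only an arithmetic check built on the addends of Table 1; the variable names are assumptions, and "only the latter factor" is read here as arithmetic computation required without fine accuracy.

```python
from itertools import product

SEX = [0, 1, 2]                  # male, either, female
EDUCATION = [0, 3, 6, 9]         # 9 is the addend for the 16th-grade category
BINARY = [12, 24, 48, 96, 192]   # addends of the five required/not-required factors

# One tuple per pattern: (sex, education, mental, manual, social, fine, arithmetic).
patterns = list(product(SEX, EDUCATION, *[(0, a) for a in BINARY]))
codes = [sum(p) for p in patterns]

total = len(codes)
college_share = sum(1 for p in patterns if p[1] == 9) / total
female_or_either_share = sum(1 for p in patterns if p[0] != 0) / total
# Assumed reading: arithmetic computation required, fine accuracy not required.
arithmetic_only_share = sum(1 for p in patterns if p[6] == 192 and p[5] == 0) / total

print(total, college_share, female_or_either_share, arithmetic_only_share)
```

The enumeration gives 384 distinct codes, one quarter of them carrying the college addend and two-thirds carrying a sex addend other than 0, in agreement with the figures in the text.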


Applications of Addend Classified Occupations 


Directions for using the addend coded list of occupations have been 
set up. Following these directions, the interviewer fills out a simple 
schedule to obtain the interviewee’s basic addend code. The interviewer 
does this by placing on the schedule those addends appropriate to the 
estimated and tested abilities and interests of the interviewee. The 
interviewer, then, finds the sum of the addends on the schedule, consults 
the addend coded list and discusses with the interviewee those occupa- 
tions which have addend codes which match the interviewee’s basic 
addend code. If such occupations do not exist, or for some reason are 
unsatisfactory, then the interviewer returns to the schedule and obtains 
as many secondary addend codes as are necessary to find satisfactory 
occupations. As has been pointed out earlier, the addend system has 
been devised so that by simple addition or subtraction of known quanti- 
ties (given in the directions), many secondary addend codes can be as- 
signed to the interviewee which are consistent with his abilities and 
interests. Sex and educational requirements can be manipulated and 
other requirements can be eliminated as desired. The schedule has been 
arranged to allow the interviewer to perform these operations while in 
the process of filling out the schedule. 
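
The directions themselves are not reproduced in this report, so the following is only a hypothetical sketch of how secondary addend codes might be generated. It assumes that an interviewee can be matched with occupations open to his or her own sex or to either sex, with occupations requiring no more education than the interviewee has, and with occupations requiring any subset of the factors the interviewee possesses. The function name `secondary_codes` and its parameters are inventions for this example.

```python
from itertools import combinations

EDUCATION_ADDENDS = [0, 3, 6, 9]

def secondary_codes(sex_addend, education_addend, abilities):
    """List the addend codes whose requirements the interviewee meets (assumed rule).

    sex_addend: 0 for a male interviewee, 2 for a female interviewee.
    education_addend: addend of the highest education category attained.
    abilities: addends of the required/not-required factors the interviewee has.
    """
    sexes = {sex_addend, 1}  # own sex, or occupations open to either sex
    educations = [e for e in EDUCATION_ADDENDS if e <= education_addend]
    codes = set()
    for size in range(len(abilities) + 1):
        for subset in combinations(abilities, size):  # eliminate requirements as desired
            for sex in sexes:
                for education in educations:
                    codes.add(sex + education + sum(subset))
    return sorted(codes, reverse=True)

# A male high school graduate with mental ability (12) and social intelligence (48);
# his basic addend code is 0 + 6 + 12 + 48 = 66.
print(secondary_codes(0, 6, [12, 48]))
```

Under these assumptions the basic code 66 gives rise to 24 codes in all, including 67 (the same pattern open to either sex) and 60 (the same pattern with the lowest education category).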




Another interesting and important feature is inherent in the addend 
classification. Occupational families based on a single factor or require- 
ment, or any designated combination of factors can be easily determined 
after the occupations have been addend coded. Construction of families 
can be done almost as if the list of addend coded occupations remained 
as a pack of speed sort cards, with the same factors present or absent. 
Now, however, instead of sorting with a needle, one sorts by mathematical 
means. 

Listed below are two of the twelve arithmetic expressions or formulae 
for selecting all occupations with a given requirement. 


1. To select all occupations which require a high degree of fine ac- 
curacy (addend 96), select all occupations whose addend codes are be- 
tween 96 and 191 and between 288 and 383 inclusive. If F represents the 
addend for fine accuracy, then the algebraic expression for obtaining a 
family of occupations which require fine accuracy becomes F, F + 1, ... 
2F − 1, 3F, 3F + 1, ... 4F − 1. 

2. To sort out all male occupations, select all occupations whose 
addend codes are multiples of 3; namely 0, 3, 6, 9, ... 381. The arith- 
metic expression becomes 0 × 3, 1 × 3, 2 × 3, 3 × 3, ... 127 × 3. 


Any combination of factors to form a family based on a designated 
pattern of requirements is possible. For example, to select all occupa- 
tions for males which require a high degree of fine accuracy, one selects 
addend codes which can both satisfy the expression for fine accuracy as 
well as the expression for the male requirement. Thus, one would choose 
all occupations whose addend codes were multiples of 3 and were included 
between 96 and 191 and between 288 and 383. Neither addend code 3 
nor addend code 97 satisfies both expressions, whereas 99 does and is one 
of the codes selected. 
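
The two selection rules and their combination can be written as simple tests on the addend code. This is a minimal sketch; the function names are assumptions, and only the numerical ranges and the multiple-of-3 rule come from the text.

```python
def requires_fine_accuracy(code, f=96):
    """True for codes in F..2F-1 or 3F..4F-1, i.e. 96-191 or 288-383."""
    return f <= code <= 2 * f - 1 or 3 * f <= code <= 4 * f - 1

def is_male_occupation(code):
    """Male-required occupations have addend codes that are multiples of 3."""
    return code % 3 == 0

# Family of male occupations requiring a high degree of fine accuracy.
selected = [c for c in range(384)
            if requires_fine_accuracy(c) and is_male_occupation(c)]

for code in (3, 97, 99):
    print(code, requires_fine_accuracy(code), is_male_occupation(code))
```

Code 3 passes only the male test, 97 only the fine accuracy test, and 99 passes both, as in the example above.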

In summary, the list of addend coded occupations can be used as a 
“universal” occupational family by the simple expedient of constructing 
an alphabetical or DOT cross-reference index. If an interviewee has had 
successful experience in a given occupation, the interviewer can obtain 
the addend code of that occupation from a furnished cross-reference index, 
and then refer to the occupations listed according to their addend codes. 
He could find other occupations with the same or similar addend codes 
which represented occupations into which the interviewee might profitably 
transfer. 

It would seem that the addend classification of occupations in large 
plants and industries might facilitate the efficiency of transfer and pro- 
motion as well as aid in selection procedures. Add in the possibility of its 
use in coding the results of job evaluations, and the values of an addend 




classification applied in an industrial organizational structure would 
appear to be manifold. 

By knowing an occupation or job's addend code number, a personnel 
worker, acquainted with the addend system used in his plant, would be 
able to describe immediately the pattern of factors which were required by 
that job or occupation. He would be able to do this without any need 
to refer to his office files. In conference, he might have a better command 
of the total occupational structure of the plant, for the addend system 
would provide a good frame of reference for him. During conversations 
with workers while outside his office, the personnel worker might be 
better equipped to discuss the requirements of the workers' jobs. 

Likewise, if desired, the worker on his job would know where he stood 
in relation to other workers on other jobs. He could have some under- 
standing of his transfer and promotion possibilities. If job evaluation 
results were addend coded, the worker might be able to understand why 
and how his job was evaluated in relation to others. 

Discussion 

Shartle (7) suggested that the individual counseling agency take 
available source material on occupations and punch the items on small- 
size speed sort cards and supplement them with notations on information 
gathered from local establishments. 

In counseling, the cards can be quickly sorted as questions arise. Suppose, 
for example, that a counselee is a high school graduate, he likes working with 
machines, and he apparently has good dexterity. It is possible to reveal im- 
mediately a family of occupations which possess these characteristics. . . . 

The same type of flexible arrangement can be provided for jobs within a 
plant. The items can be taken from job descriptions and may include addi- 
tional information such as department numbers, suitability for women and 
wage rates . . . (7, p. 184) 

The techniques described in this present study accomplish the same 
purposes. Instead of having a collection of occupations on punched 
cards, the occupations are now coded and presented in tabular fashion. 
However, because of the peculiar qualities of the code, the information 
remains just as available as if still on punched cards. Instead of sorting 
with a needle, sorting can be done by simple arithmetic. But appreciate 
the difference between having to mass duplicate one thousand cards and 
printing one list of a thousand codes and occupations. A thousand cards 
will be approximately a foot in thickness; a list of a thousand occupations 
takes no more than twenty typewritten pages. 

Many alternative approaches to applying addends to occupational 
classifications are possible. One could assign factors to occupations on 
the basis of test results or expert ratings; one could use fields of work 
as the basic units to be coded; one could substitute crucial and significant 
elements for factors; and one could use factors obtained originally by job [...] 

Limitations 

Along with the advantages of addend coding, there are several difficul- 
ties inherent either in addends themselves or in all approaches to sys- 
tematizing psychological patterns. Some of these limitations are as 
follows: 

1. Since addend coding involves a geometric progression, there is a limit to 
the number of classification factors which can be coded before the addend 
system becomes cumbersome and the ensuing arithmetic computations become 
complicated and laborious. 

2. Although only simple addition is involved [...], errors in computation 
may occur. This tendency may be mitigated as the user becomes thoroughly 
familiarized with the addend structure he is using, after a few days of practice. 

3. “One and one seldom make two,” when psychological entities are in- 
volved. In the development of addend systems, as well as of any methods of 
grouping of traits, one must be aware that [...] the sum of its parts. 

4. The tendency may increase to rely too [...] advice as if it were an oracle 
rather [...]. No matter how accurate [...] mechanistic type of guidance [...] 
used only as another tool [...] the interviewee gains insight [...] opportunities 
and attempts to [...] 

5. [...] information will not only mean eliminating [...] newly created ones, 
but it will also necessitate [...]. For instance, during periods of labor [...] 
will tend to be relaxed. As the economic [...] changes to one of labor surplus, 
educational requirements may be greatly [...] 

Summary 


1. Information compiled by the USES, now present only on speed
cards, accessible to very few counselors and unwieldy in the counseling
situation, may be converted into easily printed and easily distributed
... which will make this information readily available in usable form
... counseling and employment centers.

2. Seven variables have been coded by one numerical system. Instead
of each factor needing a symbol of its own, the factors can all be
...d into one meaningful group of one to three digits.

3. Occupational data, when classified and coded into an addend system,
become a “universal” occupational family. Such a “universal”
family does not have to be based on one factor, one fixed combination of
factors, or one occupation. Literally, hundreds of occupational families
... within an addend coded list of occupations.




4. With proper use and caution, this type of classification presents a 
wealth of occupational information for efficient and speedy appraisal in 
the vocational counseling and placement situations. 

5. Addend classifications, when applied to industrial situations, may
facilitate flexibility of organization in that they may make more efficient
the hiring, transferring and promoting of employees.
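
The idea of coding several factors into one number can be sketched as follows. This is only an illustration under stated assumptions: the seven factor names are hypothetical placeholders (the article's own variables are not listed in this part of the text), each factor is treated as simply present or absent, and the addends are taken to be the geometric progression 1, 2, 4, ..., 64. The scheme actually used in the article may differ.

```python
# Hypothetical factor names; the article's seven variables are not given here.
FACTORS = ["factor_a", "factor_b", "factor_c", "factor_d",
           "factor_e", "factor_f", "factor_g"]

# Assumed addends: a geometric progression 1, 2, 4, ..., 64 (one per factor).
ADDENDS = {name: 2 ** i for i, name in enumerate(FACTORS)}


def encode(present):
    """Add up the addends of the factors that apply to an occupation."""
    return sum(ADDENDS[name] for name in present)


def decode(code):
    """Recover the set of factors from a code by testing each addend's bit."""
    return {name for name, addend in ADDENDS.items() if code & addend}


# One occupation described by three of the seven factors: 1 + 4 + 64.
example_code = encode({"factor_a", "factor_c", "factor_g"})
example_factors = decode(example_code)

# The largest possible sum for seven yes/no factors.
largest_code = encode(FACTORS)
```

Under these assumptions every code lies between 0 and 127, which is consistent with the "one to three digits" mentioned in the summary, and each added factor doubles the range of codes, which is one way to read the limitation about the geometric progression.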


Received February 11, 1948 


References 


1. Bass, B. M. Application of addends to occupational classification. Unpublished
Master's thesis, Ohio State Univ., 1947.

2. Davis, E. W. A functional pattern technique for classifying jobs. New York:
Teachers College, Columbia Univ., 1942.

3. Dvorak, B. J. The new USES general aptitude test battery. J. appl. Psychol.,
1947, 31, 372-376.

4. Jaspen, N. A factor study of worker characteristics. Unpublished Master's thesis,
George Washington Univ., 1944.

5. Michelman, C. A. A technique for classifying occupations according to the demands
made on certain personal characteristics of workers. Unpublished Doctoral
dissertation, Northwestern Univ., 1942.

6. Paterson, D. G., Gerken, C., and Hahn, M. E. The Minnesota occupational rating
scales and counseling profile. Chicago: Science Research Associates, 1941.

7. Shartle, C. L. Occupational information. New York: Prentice-Hall, 1946.

8. Stead, W. H., Shartle, C. L., and Associates. Occupational counseling techniques.
New York: American Book, 1940.

9. Toops, H. A. A contribution to the theory and technique of classification. Ohio Coll.
Ass. Bull. No. 100, 1935 (mimeographed).

10. Toops, H. A. The use of addends in experimental control, social census and
managerial research. Psychol. Bull., 1948, 45, 41-74.

11. Viteles, M. S. Industrial psychology. New York: W. W. Norton, 1932.

12. Williams, G. A factor study of the characteristics of 781 occupations in the clerical
and sales group. Unpublished Master's thesis, Ohio State Univ., 1947.

13. Dictionary of Occupational Titles, Part I: Definitions of Titles. United States
Employment Service, Washington, D. C.: U. S. Govt. Print. Off., 1939.

14. Dictionary of Occupational Titles, Part IV: Entry Occupational Classification (revised),
Washington, D. C.: U. S. Govt. Print. Off., 1944.

15. Procedure for Transferring Data from Master Worker Characteristics Sheets to Speed
Sort Cards. War Manpower Comm., Washington, D. C.: Job Family Series,
Nov. 1944.

16. Processing Job and Worker Data for Job Families. War Manpower Comm.,
Washington, D. C.: Job Family Series, May, 1943.

17. Training and Reference Manual for Job Analysis. War Manpower Comm.,
Washington, D. C.: Job Family Series, June, 1944.


Effects of High Altitude on Speech Intelligibility *


K. D. Kryter 
Washington University, St. Louis, Mo. 


Smith and Seitz (6) and Smith (7) discussed in recent articles the 
deleterious effects of anoxia on the intelligibility of speech presented over 
an interphone. They produced anoxia by reducing in a chamber the 
content of oxygen to correspond to the oxygen content found at altitudes 
ranging up to 20,100 feet. They deduced from their findings that during
flight at altitudes above 13,600 feet oxygen equipment should be used in
order to keep interphone communication as effective as it is at sea level. 

Anoxia with subsequent “wandering of attention” on the part of the
subjects is, however, only one of the ways in which increased altitude can
lessen the intelligibility of interphone communication; in addition, the
reduction of pressure at increased altitude can lower speech intelligibility
over an interphone by affecting adversely the operation of the voice, the
microphone and the earphones. Indeed, with the effort exerted by the
talker kept constant it is found (5) that at 20,000 feet the speech signal
reaching the listener over his earphones is 1/4 the pressure the signal would
be at sea level, and at 40,000 feet it is reduced in intensity to about 1/16
its sea level pressure, because of decreased sensitivity of microphone and
earphones and loss in speech intensity. These losses are severe enough
to depress speech communication and are, of course, independent of
whether or not oxygen equipment to prevent anoxia is used.
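
A pressure ratio such as the 1/16 quoted for 40,000 feet can be restated in decibels, the unit used for the signal measurements later in the paper. The short sketch below uses the standard relation for sound pressure, 20 times the common logarithm of the ratio; the function name is an illustrative assumption, not something taken from the report.

```python
import math


def pressure_ratio_to_db(ratio):
    """Convert a sound-pressure ratio to decibels (20 * log10 of the ratio)."""
    if ratio <= 0:
        raise ValueError("a pressure ratio must be positive")
    return 20.0 * math.log10(ratio)


# The ratio quoted in the text for 40,000 feet: about 1/16 of sea-level pressure.
loss_at_40000_ft = pressure_ratio_to_db(1 / 16)

# For comparison, each halving of pressure costs about 6 decibels.
loss_per_halving = pressure_ratio_to_db(1 / 2)
```

On this reckoning a signal at 1/16 of its sea-level pressure is roughly 24 decibels down, and a signal at one quarter of its sea-level pressure would be roughly 12 decibels down.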

Since Smith and Seitz simulated what they called “high altitude” 
merely by a reduction in the oxygen content in a chamber, their findings 
do not reflect the true course of speech communication efficiency as a
function of altitude.

The present study was addressed to the problem of the effects of true 
altitude from 5000 to 35,000 feet above sea level on speech intelligibility. 
Tests were conducted over a standard U. S. Army Aircraft interphone
during flight in B-17F bombers. Anoxia was prevented in the present 
work by the use of oxygen equipment. 

* The data in this paper are taken from a joint Aircraft Radio Laboratory—Office of 
Scientific Research and Development report (2) issued during the war. The research 
was completed at Eglin Field, Florida, in the summer of 1944 under Contract OEMsr- 
658 between the Office of Scientific Research and Development and Psycho-Acoustic 
Laboratory, Harvard University. 



Since Smith and Seitz evaluated only the effects of anoxia, their work
and the present study can be considered as complementary for the purpose
of estimating the possible effects of flight at high altitude (up to
20,100 feet) without oxygen equipment upon speech intelligibility.


Test Procedures 


The principal method used in quantifying the effectiveness of interphone
communication was the word articulation test. This method
provides a measure of intelligibility, per cent word articulation, based on
the percentage of discrete words heard correctly over a particular circuit
under given conditions. In a typical articulation test, a talker reads a
list of monosyllabic words over an interphone to a group of listeners. He
places each test word in a carrier sentence, as “You will write ——.”
The listeners record the words as they hear them, and the percentage of
the words recorded correctly by the group of listeners constitutes the
articulation score. For a complete account of speech articulation testing
methods developed at the Psycho-Acoustic Laboratory during the war
the reader is referred to an article by Egan (1).
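
The scoring rule just described can be sketched in a few lines. The word list and the listeners' answers below are invented for illustration, and matching answers by exact spelling after trimming and lower-casing is an assumption; the report does not say how misspellings were treated.

```python
def articulation_score(spoken, recorded_by_listener):
    """Per cent of test words recorded correctly, pooled over all listeners."""
    correct = 0
    total = 0
    for answers in recorded_by_listener:
        for said, heard in zip(spoken, answers):
            total += 1
            # Assumed matching rule: same spelling, ignoring case and spaces.
            if said.strip().lower() == heard.strip().lower():
                correct += 1
    if total == 0:
        raise ValueError("no responses to score")
    return 100.0 * correct / total


# Hypothetical four-word list read by one talker to three listeners.
spoken = ["bat", "ship", "room", "wet"]
listeners = [
    ["bat", "ship", "room", "wet"],   # all four correct
    ["bat", "chip", "room", "wet"],   # one error
    ["pat", "ship", "room", "met"],   # two errors
]
score = articulation_score(spoken, listeners)
```

Here 9 of the 12 recorded words are correct, so the score is 75 per cent word articulation.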

Personnel. In many respects the articulation testing crew was a very
heterogeneous group, and, by the same token, representative of airplane
crews in general. The crew members were Army Air Forces personnel,
average or above in intelligence, and qualified for high-altitude flying.
The men were trained daily for several weeks in the ground laboratory.
During the training, they served both as speakers and as listeners and
used all of the standard types of interphone equipment. Thus they became
familiar with one another’s peculiarities of speech, and they learned
to pronounce and to spell the words and to use the interphone equipment.

Following the training, the men were fitted with oxygen masks and 
were given experience with oxygen equipment in an altitude chamber. 

Speech Material and Delivery. The words used in the articulation
tests were arranged in lists of 50, drawn from a total vocabulary of 1200
monosyllabic words in such a way that the individual lists had the same
proportionate phonetic composition as everyday speech and were, as
determined by actual test, essentially equivalent in difficulty. The order
of words in each of the 24 master lists was rearranged 20 times to provide
480 test lists. Scores obtained with one test list can be compared directly
with scores obtained with any other.
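
The bookkeeping behind the 480 test lists (24 master lists, each rearranged 20 times) can be illustrated as below. The word tokens are placeholders rather than the actual vocabulary, and random shuffling with a fixed seed is an assumption; the report does not say how the rearrangements were produced.

```python
import random


def make_test_lists(master_lists, rearrangements=20, seed=0):
    """Return every master list in `rearrangements` different random orders."""
    rng = random.Random(seed)  # fixed seed so the lists can be regenerated
    test_lists = []
    for master in master_lists:
        for _ in range(rearrangements):
            # sample() with k equal to the list length gives a shuffled copy.
            test_lists.append(rng.sample(master, k=len(master)))
    return test_lists


# Placeholder vocabulary: 24 master lists of 50 made-up tokens (1200 in all).
master_lists = [[f"word{m:02d}_{i:02d}" for i in range(50)] for m in range(24)]
test_lists = make_test_lists(master_lists)
```

Each test list contains the same 50 words as its master list, so only the order differs from one reading to the next.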

The instruction to each of the speakers before each flight was simply
that he should make himself as intelligible as possible while reading tests.
The sound of his own speech in his earphones was his only clue as to how
intelligibly he was speaking and as to what signal strength he was producing.
Thus, an effort was made to have each speaker use the interphone
just as he would on a bombing mission. The talkers paced themselves
at a rate which allowed the listeners to record the words easily; on
the average, the interval between test words was approximately three
seconds.

Flight Plan. B-17F (Flying Fortress) bombers were used for all the 
flight experiments. 

On each flight, the articulation crew consisted of seven or eight men,
depending upon the number of oxygen stations available in a particular
B-17F. Two experimenters accompanied the crew on each flight, one 
supervising the tests from the radio compartment, the other keeping 
records and making physical measurements in either the bombardier’s 
or the tail gunner’s compartment. The men in the waist, tail, and the 
ball turret had electrically heated flying suits, while the remainder wore 
fleece-lined clothing. 

Unless the flights were interrupted by engine trouble, fuel shortage, 
oxygen leaks, or the bends, it was possible to follow, approximately, a 
flight plan which involved: 


Take off and ascent to 5,000 feet
Level flight at 5,000 feet .................. 1 hour
Ascent from 5,000 to 35,000 feet ............ 1 hour
Level flight at 35,000 feet ................. 1 hour
Descent from 35,000 feet to sea level ....... 1 hour


Inasmuch as 35,000¹ feet was near the ceiling of the planes in which the
tests were conducted, it was necessary on some of the flights to conduct
the high-altitude tests at 33,000 feet, whereas on others it was possible to
go as high as 37,000 feet and to remain at high altitude for as long as 1
hour and 35 minutes.

Interphone Equipment. In order to determine the functional relation
between altitude and intelligibility, articulation tests were conducted
during ascent and descent, using a system consisting of the following
standard Army Air Force components: (1) AAF oxygen mask (A-14);
(2) Air-activated carbon microphone (ANB-M-C1); (3) Fixed-gain
amplifier (BC-347-C); (4) Magnetic earphones (ANB-H-1); and (5)
Helmet (AN-H-14) with noise excluding earphone sockets (AN-48440-1).


Results and Discussion 


The ascents and descents of 13 high-altitude flights were devoted to
the determination of the relation between intelligibility and altitude (2).
On each flight, from two to four members of the testing crew took turns
serving as talkers, reading the test lists to the other members who served
as listeners. Individual curves were plotted for each talker, one for
ascent and one for descent. From the individual curves, articulation
scores for altitudes between 5000 and 35,000 feet were obtained by
interpolation at 5000-foot intervals. Summary curves, shown in Figure 1,
were then based on the interpolated values. The curves show that
intelligibility falls off gradually up to approximately 25,000 feet, then
declines more rapidly.

¹ The altitudes mentioned in this report are “pressure” altitudes. At the
temperatures encountered during the high-altitude tests, a pressure altitude
of 33,000 feet corresponds to a “tape-line” altitude of 35,000 feet.
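
Reading each talker's curve at 5000-foot intervals can be illustrated with straight-line interpolation between measured points. The readings below are invented, and linear interpolation is an assumption: the report says only that values were obtained "by interpolation," and they may well have been read from hand-drawn curves.

```python
def interpolate(points, x):
    """Linear interpolation between the two measured points that bracket x."""
    points = sorted(points)
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("altitude outside the measured range")


# Hypothetical readings for one talker's ascent:
# (altitude in feet, per cent word articulation).
ascent = [(5000, 66.0), (12000, 63.0), (21000, 58.0),
          (28000, 50.0), (35000, 40.0)]

# Values at 5000-foot intervals from 5000 to 35,000 feet.
grid = range(5000, 35001, 5000)
curve = {altitude: interpolate(ascent, altitude) for altitude in grid}
```

Averaging such interpolated values across talkers at each grid altitude would then give a summary curve of the kind plotted in Figure 1.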

Fig. 1. Intelligibility of speech as a function of altitude. (Ordinate: per
cent word articulation; abscissa: altitude in thousands of feet.) The dashed
curves, based on 29 pairs of individual functions for 12 talkers, show the
percentages of the words understood correctly at various altitudes during
ascent and descent. The solid line represents ...

It is of interest to note, in Figure 1, that the articulation scores
obtained during descent were higher, on the average, than those obtained at
the same altitudes during ascent. This “hysteresis” effect may be
explained at least in part as follows:


(a) The members of the test crew appeared to get a “lift” from the 
completion of the high-altitude tests. It was good to be alive and on the 
way down. The slightly improved performance during descent may 
therefore have been the result of increased motivation. 

(b) The noise level in the plane was somewhat higher during ascent 
than during descent; more power was required for climbing than for 
descending. 




The fact that the scores obtained during the last half of the testing
schedule were no lower for comparable altitudes than those obtained
during ascent indicates that the decrement in intelligibility scores as a
function of altitude cannot be attributed to fatigue or boredom.

It has been amply demonstrated (1) that the measurement of changes
in speech intelligibility due to variations of stress upon the communication
system is dependent to a large degree on the following factors:


(1) The type of speech material used in the tests. Sentence
intelligibility, for example, is less susceptible to deterioration than word
intelligibility.

(2) The interphone system employed. The better the system the 
higher will be the speech intelligibility scores and the greater must be the 
stress before they are adversely affected. 


The measured changes in intelligibility probably would have been 
different, therefore, had a different type of test material or a different 
type of interphone been employed. The discrete monosyllabic word 
articulation test was chosen for this study because: (a) it is a more severe 
and hence a more sensitive test of communication under conditions of 
stress, and (b) military communication under battle conditions often
consists of short phrases or words. The particular interphone used
represents the best aircraft interphone equipment available for military
aircraft at the time the tests were conducted.

Fig. 2. Signal intensity as a function of altitude. (Ordinate: voltage in
db re 1.0 volt; abscissa: altitude in thousands of feet.) During the
articulation tests conducted during ascent and descent, the intensity of the
speech signal followed a course similar to that of the articulation scores
(cf. Figure 1). The measurements of signal intensity were made with a
voltmeter (VU meter) connected across the headsets. Thus they reflect the
drop in voice level with increasing altitude and, in addition, the change
in the sensitivity of the microphone.

Fig. 3. Effect of substituting a more difficult carrier phrase for “You will
write ——.” (Ordinate: per cent word articulation; abscissa: altitude in
thousands of feet.) The solid curve shows the relation between
intelligibility and altitude as determined with the standard carrier
sentence. (This function starts at a slightly higher level of intelligibility
than that shown in Figure 1 because different speakers were used in obtaining
the data presented in Figure 1 and those shown in Figure 3.) The dashed curve
shows the relation between the same variables as determined under the same
conditions, except that the more demanding carrier phrase, “One, two, three,
four, five, ——, seven,” was used. In these tests 6 talkers read tests during
ascent only, using alternately the standard and longer ...

Signal Intensity. During the tests described above, one of the
experimenters recorded, for as many of the test words as possible, the
deflection of a voltmeter (VU meter) connected across the headset line.
Average curves, summarizing these measurements of signal intensity,
are shown in Figure 2. The decrease in signal intensity with increase in
altitude is attributable to reduction in voice level, to reduction in
microphone sensitivity, or to both. Although it is not possible to apportion
the decrement accurately as between the voice and the microphone,
comparison of the curves of Figure 2 with data (5) obtained in
altitude-chamber experiments indicates clearly that, during actual flight, the
talkers compensated for the difficulties of speaking at high altitude by
increasing vocal effort. The altitude-chamber data indicate that, with
constant effort, the drop in voice level between 5000 and 33,000 feet is
7 or 8 decibels, whereas the decrement shown in Figure 2 is only a little
over 4 decibels. Thus the decrement due to the voice (and microphone)
in actual flight is considerably less than the decrement due to the voice
alone with constant effort.

There is a parallel between the functions relating intelligibility to 
altitude (Figure 1) and the functions relating signal intensity to altitude 
(Figure 2). This parallel would suggest that the difficulty in communica- 
tion at high altitudes results at least in part from the weakness of the 
speech signal delivered to the listeners’ earphones. Actually, since the 
earphones themselves decrease in sensitivity with increasing altitude (a 
drop of around 10 decibels at 35,000 feet), the decline in signal intensity 
at the listeners’ ears is more marked than the curves of Figure 2 suggest. 

In order to determine how much increasing the signal intensity would 
improve high-altitude communication, further experiments were con- 
ducted with a variable-gain interphone amplifier (3, 4). In these tests, 
the gain (and consequently the intensity of the signal at the listeners’ 
ears) was varied systematically. The optimal gain for high altitude
proved to be approximately 12 decibels greater than that required at
5000 feet.
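
The decibel figures quoted in this section can be set side by side in a rough budget. The sketch below assumes that the losses of successive components simply add in decibels and that the figures, which the text gives for slightly different altitude ranges, can be combined; it is a back-of-the-envelope check, not a calculation taken from the report.

```python
# Figures quoted in the text (all approximate).
VOICE_AND_MIC_LOSS_DB = 4.0    # in-flight decrement measured at the headset line
EARPHONE_LOSS_DB = 10.0        # earphone sensitivity drop at 35,000 feet
OPTIMAL_EXTRA_GAIN_DB = 12.0   # extra amplifier gain found best at high altitude

# Assumption: losses of cascaded components add when expressed in decibels.
total_loss_db = VOICE_AND_MIC_LOSS_DB + EARPHONE_LOSS_DB
uncompensated_db = total_loss_db - OPTIMAL_EXTRA_GAIN_DB


def db_loss_to_pressure_ratio(loss_db):
    """Pressure ratio that corresponds to a loss of `loss_db` decibels."""
    return 10 ** (-loss_db / 20.0)


pressure_ratio_at_ear = db_loss_to_pressure_ratio(total_loss_db)
```

On these assumptions the signal at the listener's ear is down by about 14 decibels, or about one fifth of its low-altitude pressure, so the 12 decibels of added gain found to be optimal would make up most, though not all, of the loss.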

The decrease in signal intensity is, of course, not the only factor
contributing to the reduction of intelligibility at high altitude. Other 
factors, such as alteration of the frequency characteristics of the inter- 
phone components at high altitudes, may have further detrimental 
effects upon the intelligibility of interphone communication. 

Sentence Length. It was noted that, at high altitude, it was more 
difficult to speak in sentences than it was to utter isolated words or 
phrases. The talkers resisted instructions to enunciate the carrier, 
“You will write,” as carefully as the key test words, and tended to save 
their breath for the latter. It was thought, therefore, that the articula- 
tion scores obtained with the short carrier sentence might not reflect the 
difficulty of carrying on extended conversations over interphones at high 
altitudes. 

In order to study this question, a balanced series of tests was
conducted, during ascent only, with six talkers. Each of the six talkers
served on two flights, using the short carrier sentence, “You will write
——,” on one and the long carrier phrase, “One, two, three, four, five,
——, seven,” on the other. The speakers were instructed, and reminded
frequently, to stress equally all seven words of the longer phrase, giving
no more and no less emphasis to the test word which was inserted in the
place of “six” than to the numerals. The whole phrase was to be said
with a single breath. For these tests, 21-word lists were used instead of
the 50-word lists described above.

The summary curves, shown in Figure 3, indicate that intelligibility
was consistently lower with the long carrier than it was with the short
one. More important, the scores obtained with “One, two, three, four,
five, ——, seven,” declined more rapidly with increasing altitude than did
those obtained with “You will write ——.” The decrement between
5000 feet and 35,000 feet was 21 articulation points with the short
carrier, 33 articulation points with the long carrier.

It was thought that an explanation for the disproportionate difficulty
of speaking in longer sentences would be found in the measurements of
speech intensity which were made during the tests, that speech intensity
would decline more rapidly as a function of altitude with the long carrier
than with the short one. The measurements indicated (Figure 4),
however, that the key word in “One, two, three, four, five, ——, seven”
was on the average, at 35,000 feet as well as at 5000 feet, about 2 decibels
weaker than the key word in “You will write ——.” It appears, therefore,
that speech intensity did not decline any more rapidly with the long
carrier than with the short, but that, in endeavoring to maintain adequate
intensity level with the long carrier, the talkers relaxed in their efforts to
enunciate clearly.

Fig. 4. Signal intensity with standard and long carrier phrase. (Ordinate:
voltage in db re 1.0 volt; abscissa: altitude in thousands of feet. One
curve is shown for “You will write ——” and one for “One, two, ..., ——, seven.”)

Summary and Conclusion


Intelligibility of discrete words over a standard aircraft interphone is
shown to decrease during flight from about 65% words correctly heard at
... feet to near 40% (Figure 1) at 35,000 feet. This deterioration in
speech intelligibility is attributable to a depression in the operating
efficiency of the voice, the microphone and the earphones as the result of
the reduced pressures encountered at high altitude. Anoxia did not play
a role since oxygen equipment insured an adequate supply of oxygen to
the subjects at all altitudes.

The facts show that if speech intelligibility is to be kept at sea level
or low altitude efficiency, which is under some circumstances none too
high to begin with, adequate compensation in terms of additional
interphone amplifier gains must be provided at high altitudes.

Received February 11, 1948.

References*

1. Egan, J. P. Articulation testing methods—II. Washington, D. C.: Publications
Board, U. S. Dept. of Commerce, OSRD Report No. 3802, Report PB 27848,
1 November 1944.

2. Kryter, K. D. Articulation-test comparison of six signal corps (aircraft) interphones
at low and high altitudes. Washington, D. C.: Publications Board, U. S. Dept. of
Commerce, ARL Memorandum Report No. 151, OSRD Report No. 1974.
Report PB 22909. 1 March 1944.

3. Licklider, J. C. R. Voltage gain and power-output capability requirements for
high-altitude interphone amplifiers. Washington, D. C.: Publications Board, U. S.
Dept. of Commerce, ARL Memorandum Report No. 158, OSRD Report No.
1975. Report PB 22911. 10 March 1944.

4. Licklider, J. C. R., and Kryter, K. D. Articulation tests of standard and modified
interphones conducted during flight at 5000 and 35,000 feet. Washington, D. C.:
Publications Board, U. S. Dept. of Commerce, ARL Memorandum Report No.
149, OSRD Report No. 1976. Report PB 5505. 1 July 1944.

5. Rudmose, H. W., et al. Effects of high altitude on the human voice. Washington,
D. C.: Publications Board, U. S. Dept. of Commerce, OSRD Report No. 3106.
30 January 1944.

6. Smith, G. M., and Seitz, C. P. Speech intelligibility under various degrees of anoxia.
J. appl. Psychol., 1946, 30, 182-191.

7. Smith, G. M. The effect of prolonged mild anoxia on speech intelligibility. J. appl. ...

* ... Psycho-Acoustic Laboratory, Harvard University, and Aircraft Radio Laboratory,
... Field, high altitude project at Eglin Field, ...


The Interpretation of Interest Profiles 


Solomon Diamond 


Veterans Administration Guidance Center (Los Angeles), 
University of California, University Extension 


Every counselor in a Veterans Administration Guidance Center is
faced, from time to time, in the routine performance of his duties, with
the need to review an advisement which had been completed at an earlier
date and perhaps in a distant city, which for one reason or another had
proved unsatisfactory with the passage of time. At such times, one
studies the earlier record to see what might be learnt from it for one’s
own future work. The decision to write this article came out of the
writer's conviction that in a number of such cases there seemed to have
been a too complete reliance on the high points of the interest profile.
For example, in a recent case, a veteran who had abandoned the course 
of professional training that had been selected in the original advisement, 
came to ask confirmation for another, which would be academically less 
demanding. The new objective was still unsuitable. Both were related 
to the high points of the interest profile, which, as is often the case, is 
better remembered by the client than any other tests. This veteran had 
really excellent mechanical aptitude, but a low average score in mechani- 
cal interest. For this reason, mechanical objectives had not been seri- 
ously considered. The first two sessions were taken up largely with 
discussion of reasons why the veteran was so desirous of studying for a 
professional objective. In the third, he announced with enthusiasm that 
he had found a really excellent training opportunity, for a mechanical
trade. In what follows, we shall try to show why, in many such cases, 
the lower score for mechanical or commercial interest should not be 
regarded as a serious deterrent to success, and to offer data which will 
assist in the more useful interpretation of interest profiles. 

We shall try to show that the customary almost exclusive concern 
with the high points of the profile fails to take account of certain necessary 
implications which arise from the fact that the different occupational 
fields represented by the separate scales of an interest inventory offer 
employment to widely different numbers of workers. Every counselor 
knows that other facts in the record, such as aptitude scores, may influ- 
ence the weight to be given to interest scores. It is our intention to show 
that, other things being equal, an average score on one interest scale may
represent a more positive indication for entry into the corresponding
field than a score that is considerably and quite reliably above average
in another field.

The Kuder Preference Record has been found to be the most commonly
used of all tests in guidance centers (1). Its manual (4) states: “Look
over the profile to see which scores are above the 75th percentile. . . .
If there are no scores above the 75th percentile, scores above the 65th
percentile should be inspected.” A footnote adds: “¹ There is probably
little to be gained by attempting to set up separate cutting points
for each scale, and there is much to be lost in convenience of
interpretation. . . . Scores well above the 75th percentile can be regarded
with greater confidence. Those somewhat below it may deserve some
consideration but must be regarded as less likely to be an expression of a
true interest in the field.”

In these recommendations, there is an implicit assumption that equal 
rankings on the different scales represent equal intensities of interest. 
It is implied that a score at the 65th percentile on one scale represents less 
interest in that field than a score at the 85th percentile on another scale. 
This is not necessarily true, although it might hold for a given interest 
inventory. It is obvious that a low rank in one field may represent 
greater interest than a higher rank in another, less popular field. 

Suppose, for example, that our interest inventory included a scale for 
Parenthood. The high-school girl whose score is at the 25th percentile 
for Parenthood may have a rather intense interest in the prospect of 
marriage, children, and home management, and the probability is high 
that she will in time become a mother. Unless we are ready to recom- 
mend wholesale revision of our social institutions, we should have to set 
our cutting point on this scale below the 25th percentile. This would be
very obvious, if we dealt with an interest scale for Parenthood, and
discussed its implications with our clients. The principle is the same for
other fields, for which we do measure interest, and which we do discuss
with our clients.

Not 25 per cent of employed urban men, but approximately 40 per
cent, are engaged in occupations of a mechanical nature. It is, therefore,
a statistical absurdity to expect that all the men who enter the mechanical
field shall have mechanical interest above the 75th percentile. It would
be an economic absurdity to suppose that we could create a revolution in
our industrial organization, so that the proportion of workers required
for mechanical occupations should conform to our preconceived idea of
the proper level of interest for such work.¹

¹ Nor would we want to make such a change. In counseling for parenthood, we would
want to make better parents, not fewer. In counseling for occupational choice, we would
want to make better mechanics, taking pride and pleasure in their work, not fewer. Is
not the low score for mechanical interest too often an indicator of personal feelings of
inadequacy?




Table 1

Distribution of Employed Men and Women, According to Interest Categories
of the Kuder Preference Record, Based on U. S. Census Data for 1940

Interest Category     Men (per cent)     Women (per cent)
Mechanical                 38.0                 1.6
Computational               2.5                 7.6
Scientific                  4.0                 0.8
Persuasive                 34.7                21.8
Artistic                    2.9                 6.9
Literary                    0.5                 1.0
Musical                     0.4                 1.2
Social Service              2.9                20.7
Clerical                   18.6                38.5


In Table 1, we have summarized, according to the interest categories 
of the Kuder Preference Record, 1940 Census employment data for al- 
most 200 leading occupations (7), embracing about 18 million men and 6 
million women. In assigning each occupation to a single field, although 
two or more categories might appear relevant, the suggestive lists in the 
manual were followed as closely as possible. In preparing this summary, 
we omitted occupations in agriculture, in mining, and the merchant 
marine; the basic table did not include members of the armed forces, 
urban laborers, or unskilled operative workers. In this way we avoided 
any overloading of the mechanical category. No assignment could be
made for the professional and semi-professional workers who were listed
as “not elsewhere classified,” but these represent only about 0.7 per cent
of all men, and about 0.6 per cent of all women, so that their exclusion 
does not produce any important distortion. 

Table 2 presents similar data from another source, using the interest
categories of the Occupational Interest Inventory (5). It is based on a
mimeographed publication of the Los Angeles City Schools District (3),
which lists the numbers engaged in 400 occupations, according to the Los
Angeles City Directory of 1940. The assignment of each occupation to
an interest category is made in that listing.

Table 2

Distribution of Men and Women Workers According to Categories of the Occupational
Interest Inventory, Based on Los Angeles City Directory of 1940

Interest Category     Men (per cent)     Women (per cent)
Personal-Social            13.56               30.0
Natural                     1.50                0.30
Mechanical                 39.95               16.16
Business                   36.75               49.04
The Arts                    8.82                3.58
The Sciences                2.99                0.92

The salient fact which appears from either table is that some fields of 
work engage so large a proportion of men or women, as the case may be, 
that it is mathematically impossible for all those engaged to score above 
the 75th percentile of the general population in interest for those fields. 
Other fields employ so small a proportion of workers that, if we assume 
strength of interest to be an important factor in selection, we must 
assume that entrants will, as a rule, score far above the 75th percentile. 

If we were to base our prediction of success (and satisfaction) solely 
upon measured interest, we would select a different cutting point for each 
scale, guided primarily by these employment figures. These cutting 
points might vary between the 60th percentile and the 95th. However, 
we would still be faced frequently with the need to interpret interest 
profiles in which none of the scores reached the cutting points set up. In 
interpreting such profiles, it would be necessary to make some assump- 
tion as to the degree of correlation existing between measured interest 
and achievement. 
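
The arithmetic behind such cutting points can be sketched in a few lines of Python. This is an illustration only: it rests on the simplifying assumption, made here purely for the sake of the arithmetic, that a field is filled entirely by the persons with the strongest measured interest in it. The function name is hypothetical, and the shares are the men's figures of Table 2.

```python
def lowest_possible_cutting_percentile(share_employed_pct):
    """Percentile rank of the weakest entrant if a field employing
    `share_employed_pct` per cent of workers took only the top scorers."""
    if not 0 < share_employed_pct <= 100:
        raise ValueError("share must be a percentage between 0 and 100")
    return 100.0 - share_employed_pct


# Men's shares taken from Table 2 (Los Angeles City Directory of 1940).
mens_shares = {"Mechanical": 39.95, "Business": 36.75, "The Sciences": 2.99}

for field, share in mens_shares.items():
    cut = lowest_possible_cutting_percentile(share)
    print(f"{field}: entrants cannot all stand above percentile {cut:.2f}")
```

Under this assumption a field engaging about 40 per cent of men cannot draw all its entrants from above the 60th percentile, while a field engaging 3 per cent could in principle draw only from above the 97th.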

It is surely not too pessimistic to assume that this correlation, in any 
given field, would not be higher than 0.40.* Without any implication 
that this is a correct figure, we may use it for illustrative purposes. Using 
conventional regression formulae, we can determine the most probable 
percentile rank in achievement, to be predicted on the basis of any given 
percentile rank in measured interest. To do this most easily, we have 
taken advantage of a published Prediction Table (2), making the neces- 
sary conversions between percentile ranks and standard scores. The 
result appears in Table 3 (where all figures are approximate). 
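
The conversion just described can be sketched in Python. The sketch assumes normally distributed scores and the usual regression of standard scores (predicted z equals r times observed z, with standard error of estimate equal to the square root of 1 minus r squared); it uses the standard library rather than the published Prediction Table, and the function name is hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def predict_achievement(interest_percentile, r=0.40):
    """Return (most probable, plus 1 S.E., minus 1 S.E.) achievement
    percentiles for a given interest percentile and assumed correlation r."""
    z_interest = STD_NORMAL.inv_cdf(interest_percentile / 100.0)
    z_predicted = r * z_interest            # regression toward the mean
    see = (1.0 - r * r) ** 0.5              # standard error of estimate

    def to_percentile(z):
        return 100.0 * STD_NORMAL.cdf(z)

    return (to_percentile(z_predicted),
            to_percentile(z_predicted + see),
            to_percentile(z_predicted - see))


probable, upper, lower = predict_achievement(31)
print(round(probable), round(upper), round(lower))
```

Because the entries of Table 3 are approximate, a computation of this kind may differ from the printed figures by a point in some rows.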

In Table 3, Column 1 gives the interest percentile, and Column 2 the 
most probable achievement percentile, assuming 0.40 correlation between 
measured interest and achievement. Columns 3 and 4 give percentile 
ranks which correspond to the upper and lower limits of the standard 
error of estimate for this prediction. Hence, an individual whose score 
for mechanical interest is at the 31st percentile will most probably attain 
the 42nd percentile in actual achievement, while in 16 cases out of 100 he 
will attain the 76th percentile or higher in achievement. At the 76th 
percentile, he would be excelled in mechanical skill by 24 per cent of the 
general population, or roughly two-thirds of those engaged in mechanical 

* This figure is higher than almost all of the correlations reported in investigations of 
the relationship between Kuder scores and academic achievement in related fields, as 
summarized by Super (6). However, it might be argued that these correlations have 
been attenuated by lack of full reliability in the criteria of achievement. For the soundness 
of our argument, it is better that we err on the optimistic side. 




516 Solomon Diamond 


Table 3 

Most Probable Achievement Predicted on the Basis of a Single Interest Measure Whose 
Correlation with Achievement is Assumed to be 0.40, Together With Points at the 
Upper and Lower Limits of the Standard Error of Estimate of this Prediction 

(All entries are percentile ranks) 

Measured     Probable        Plus       Minus 
Interest     Achievement     1 S.E.     1 S.E. 
    1            16            47          3 
    2            21            55          4 
    7            27            63          6 
   16            34            70          9 
   31            42            76         13 
   50            50            82         18 
   69            58            87         24 
   84            66            90         30 
   93            73            93         37 
   98            79            96         45 
   99            84            97         53 


work. If the same young man’s score on the literary scale is at the 84th 
percentile, his most probable literary achievement would be at the 66th 
percentile, and he has 16 chances in 100 of attaining the 90th percentile, 
at which point 10 per cent of the general population, or 20 times more 
than the number employed in literary work (refer to Table 1), will excel 
him. Hence, if we have only the interest scores and the occupational 
statistics to guide us, 31st percentile interest in the Mechanical gives a 
better chance of success than 84th percentile interest in the Literary. 

The need for separate cutting points on the different scales can also 
be shown in another way. If interest is an important selective factor, it 
must follow that persons engaged in such highly selected fields as litera- 
ture, music, and the sciences will, in general, make higher interest scores, 
in their fields, than mechanics or clerks will make in their fields. That 
is, the average rank of musicians in musical interest must be much higher 
than the average rank of mechanics in mechanical interest. 

This hypothesis can be tested with data that are presented in the 
Kuder manual, for a variety of occupational groups of men and women. 
For each of these groups, we are given the mean raw score and standard 
deviation for each scale, to be compared with similar data for a large 
“base group,” which we shall refer to hereafter as the general population. 
Assuming normal distribution of the scores of each group on each scale, 
it is possible to determine the proportion of members of the given oc- 
cupational group exceeding a given cutting score in the general popula- 
tion. Table 4 states the proportion of members of certain occupational 




groups exceeding the 75th percentile on certain relevant interest scales. 
For example, only 38 per cent of the sample group of mechanics and re- 
pairmen exceed the 75th percentile of the general population in mechani- 
cal interest, while 98 per cent of the sample group of musicians exceed 
the same point on the musical scale. A musician with musical interest 
at the 75th percentile is 2.16 S.D. below the mean for his occupational 
group, while a mechanic with mechanical interest at the 40th percentile 
is only 1.37 S.D. below the mean for his occupational group. 
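
The computation underlying Table 4 can be sketched as follows, assuming normal distributions for both the occupational group and the base group. The function name and the numerical values in the usage lines are hypothetical illustrations; they are not figures from the Kuder manual.

```python
from statistics import NormalDist


def proportion_above_base_percentile(group_mean, group_sd,
                                     base_mean, base_sd,
                                     base_percentile=75.0):
    """Proportion of an occupational group scoring above the raw score
    that marks `base_percentile` in the base group (normality assumed)."""
    cut_score = NormalDist(base_mean, base_sd).inv_cdf(base_percentile / 100.0)
    return 1.0 - NormalDist(group_mean, group_sd).cdf(cut_score)


# Hypothetical raw-score statistics, for illustration only.
share = proportion_above_base_percentile(group_mean=60.0, group_sd=8.0,
                                         base_mean=50.0, base_sd=10.0)
print(f"{100 * share:.0f} per cent of the group exceed the base 75th percentile")
```

When the cutting score lies 2.16 S.D. below the group mean, as in the case of the musicians cited above, the same computation gives a proportion of about 98 per cent.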

Table 4 is a test of the proposal that the 75th percentile shall be used 
as a cutting point on all interest scales. Using the same data and 
methods, we can determine new cutting points, by uniformly applying 
the assumption that the entrant into a given occupation is adequately 


Table 4 

Proportion of Members of Certain Occupational Groups Exceeding the 75th Percentile 
on Relevant Interest Scales of the Kuder Preference Record 

                                                          Per Cent above 75th 
Occupational Group                    Interest Scale      Percentile of Base Group 
Men: 
  Mechanics & repairmen               Mechanical                38 
  Manufacturing foremen               Mechanical                49 
  Aviators                            Mechanical                60 
  Bookkeepers & cashiers              Computational             63 
  Bookkeepers & cashiers              Clerical                  65 
  Salesmen to consumers               Persuasive                66 
  Securities salesmen                 Persuasive                81 
  H.S. teachers of commercial 
    subjects                          Clerical                  66 
  Accountants                         Computational             75 
  Authors, editors, & reporters       Literary                  87 
  Chemists                            Scientific                88 
  Social workers                      Social                    78 
  H.S. teachers of English            Literary                  93 
  Musicians & music teachers          Musical                   98 
Women: 
  Sales clerks                        Clerical                  33 
  Sales clerks                        Persuasive                41 
  Bookkeepers & cashiers              Clerical                  33 
  Bookkeepers & cashiers              Computational             44 
  Laboratory technicians              Mechanical                60 
  Laboratory technicians              Scientific                76 
  Occupational therapists             Mechanical                74 
  Artists & art teachers              Artistic                  97 
  Musicians & music teachers          Musical                   98.5 




Table 5 
Base Group Percentiles for Scores at 25th Percentile of Occupational Group Samples, for 
All Scales on Which There is a Positive Difference Significant at the 1 Per Cent Level. 
Based on Data in Kuder Manual, Assuming Normal Distributions for All Groups 

Occupational Group                        Interest Scales and Percentiles 
Men: 
  Accountants & auditors                  78 Computational, 50 Clerical, 42 Literary 
  Actors                                  84 Musical, 75 Literary, 50 Artistic 
  Authors, editors, and reporters         87 Literary, 46 Musical, 40 Persuasive 
  Chemists                                84 Scientific 
  Clergymen                               76 Social, 67 Literary, 45 Musical 
  Electrical engineers                    71 Scientific, 60 Mechanical 
  Industrial engineers                    69 Scientific, 68 Mechanical, 53 Computational 
  Mechanical engineers                    62 Mechanical, 54 Scientific 
  Lawyers and judges                      72 Literary 
  Musicians and music teachers            96 Musical, 41 Literary 
  Drugstore managers and pharmacists      50 Scientific, 48 Persuasive 
  Social workers                          78 Social, 43 Literary 
  National Service Officers of 
    Veterans’ Organizations               74 Social 
  Secondary school teachers               50 Social, 33 Literary 
  H.S. teachers of commercial subjects    66 Clerical, 35 Computational 
  H.S. teachers of English                91 Literary 
  H.S. teachers of mathematics            60 Scientific, 55 Computational 
  H.S. teachers of social studies         70 Social, 50 Literary 
  School administrators                   60 Social 
  Meteorologists                          68 Scientific, 41 Literary, 38 Mechanical 
  District supervisors of Vocational 
    Rehabilitation                        85 Social 
  Personnel managers                      46 Persuasive, 41 Social, 39 Literary 
  Aviators                                69 Scientific, 68 Mechanical 
  Draftsmen                               57 Artistic, 45 Mechanical 
  Physical education instructors          74 Social 
  Radio field engineers                   69 Scientific, 55 Mechanical, 50 Persuasive 
  Weather observers                       62 Scientific, 43 Computational, 35 Literary 
  Retail managers                         52 Persuasive 
  Production managers                     45 Mechanical 
  Sales managers                          70 Persuasive 
  Business managers                       40 Computational, 40 Literary, 31 Clerical 
  Bookkeepers and cashiers                62 Clerical, 60 Computational 
  Insurance agents and underwriters       62 Persuasive 
  Securities salesmen                     80 Persuasive, 40 Social 
  Salesmen to consumers                   67 Persuasive 
  Salesmen and agents except 
    to consumers                          64 Persuasive, 49 Clerical 
  Mechanics and repairmen                 56 Mechanical, 49 Scientific 
  Manufacturing foremen                   58 Mechanical, 45 Scientific 
  Steel manufacturing foremen             61 Mechanical, 43 Scientific 
Women: 
  Artists and art teachers                93 Artistic 
  Journalists                             64 Literary 
  Home demonstration agents               49 Social 
  Librarians                              63 Literary, 49 Artistic 
  Musicians and music teachers            93 Musical, 46 Artistic 
  Physicians                              76 Scientific, 51 Mechanical, 48 Social 
  Primary school teachers                 47 Artistic 
  H.S. teachers of commercial subjects    59 Computational, 53 Clerical 
  H.S. teachers of English                62 Literary, 39 Musical 
  H.S. teachers of home economics         36 Artistic, 35 Mechanical 
  H.S. teachers of mathematics            73 Computational, 51 Mechanical 
  H.S. teachers of languages              66 Literary 
  Occupational therapists                 74 Mechanical, 65 Artistic, 39 Musical 
  Trained nurses                          52 Social, 42 Scientific 
  Religious workers                       83 Social 
  Hospital dieticians                     46 Scientific 
  Laboratory technicians                  75 Scientific, 64 Mechanical 
  Tearoom and restaurant managers         51 Computational 
  Retail buyers                           78 Persuasive 
  Bookkeepers and cashiers                41 Computational 
  Bookkeepers                             56 Computational 
  General office clerks                   40 Clerical, 33 Computational 
  Personnel clerks                        61 Persuasive 
  I.B.M. operators                        61 Computational, 41 Mechanical 
  Statistical clerks                      47 Computational 
  Stenographers and typists               38 Clerical 
  Sales clerks                            58 Social 



suited to it if his measured interest exceeds that of one-fourth of the 
occupational group. This has been done in Table 5, which states the 
percentile rank in the base group (general population) for scores at the 
25th percentile of the occupational group; this is given for each scale on 
which there is a positive difference from the base group significant at the 
1 per cent level. To facilitate study of this table side by side with the 
data on which it is based, in the Kuder manual, the original order of 
listing has been preserved. The reader will observe that there are 
relatively few instances in which cutting points so determined are as 
high as the 75th percentile. 
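
The entries of Table 5 follow from a two-step conversion, which may be sketched in Python under the same assumption of normal distributions. The function name is hypothetical, and the values in the usage lines are invented for illustration rather than taken from the Kuder manual.

```python
from statistics import NormalDist


def base_percentile_of_group_quartile(group_mean, group_sd,
                                      base_mean, base_sd,
                                      group_percentile=25.0):
    """Percentile rank in the base group of the raw score standing at
    `group_percentile` within the occupational group (normality assumed)."""
    # Step 1: raw score at the chosen percentile of the occupational group.
    score = NormalDist(group_mean, group_sd).inv_cdf(group_percentile / 100.0)
    # Step 2: standing of that raw score in the general population.
    return 100.0 * NormalDist(base_mean, base_sd).cdf(score)


# Hypothetical raw-score statistics, for illustration only.
cut = base_percentile_of_group_quartile(group_mean=60.0, group_sd=8.0,
                                        base_mean=50.0, base_sd=10.0)
print(f"cutting point: base-group percentile {cut:.0f}")
```

A group whose distribution coincides with that of the base group yields a cutting point at the 25th percentile; the further the group mean lies above the base mean, the higher the cutting point rises.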




Our various Tables reinforce one another in pointing to the ... 
that lies in assuming that the high points of a profile necessarily point to 
the most appropriate fields for specialization. “All other things being 
equal,” i.e. without special appeal to inequalities of aptitude tests, even 
the low average score in one field may give greater chance of success than 
the high average score in another. If this seems to be suggesting a 
career of monotony for many workers, let us remember that most men 
get real satisfaction out of mechanical tinkering, but are given little 
opportunity to satisfy this interest. The low average score in mechanical 
work may indeed be analogous to the low average interest of a high school 
girl in motherhood. Neither will disqualify the individual from a v... 
and satisfying career. 
Received June 21, 1948. 

Early publication. 


References 


1. Berkshire, J. R., Bugental, J. F. T., Cassens, F. P., and Edgerton, H. A. Test 
preferences in guidance centers. Occupations, 1948, 26, 337-343. 

2. Bingham, W. V. Aptitudes and aptitude testing. New York: Harper & Brothers, 
1937. 

3. Curriculum Division, Los Angeles City Schools District. Major occupations in the 
city of Los Angeles. (Vocational Guidance Series No. XV, Curriculum Publication 
SC-330, 1947.) Mimeographed. 

4. Kuder, G. F. Revised manual for the Kuder preference record. Chicago: Science 
Research Associates, 1946. 

5. Lee, E. A., and Thorpe, L. P. Occupational interest inventory. Los Angeles: 
California Test Bureau. 

6. Super, D. E. The Kuder preference record in vocational diagnosis. J. consult. 
Psychol., 1947, 11, 184-193. 

7. United States Department of Labor. Occupational data for counselors. Bulletin 
No. 817. Washington, D. C.: Government Printing Office, 1945. 


The Kuder Interest Test Patterns of Fire 
Protection Engineers * 


George S. Speer 
Institute for Psychological Services, Illinois Institute of Technology 


As a part of orientation testing, the Kuder Preference Record is 
administered to all students entering Illinois Institute of Technology. 
This inventory is designed to measure the vocational preferences or 
interests of the individual, and to furnish a ready and objective indication 
of the type of activity in which he would probably achieve the greatest 
satisfaction. 

These interest scores, together with various achievement and aptitude 
test scores, are furnished to each counselor so that the freshman student 
has an appraisal of his interests as well as his ability. Thus, with the 
counselor, he may plan his career more wisely, selecting the curriculum to 
which he appears best adapted. 

A previous paper¹ has indicated that there are differences in vocational 
interest, significant at the one per cent level, between students entering 
engineering studies and those entering non-engineering studies. Simi- 
larly, significant differences are also found between students entering 
different fields of specialization in engineering. These differences are 
important in the guidance of the student, for it is a basic assumption in 
the measurement of interests that where ability, opportunity, and effort 
are equal, the individual will achieve the greatest satisfaction and success 
in the area where he has the greatest interest. 

Our studies of over 1,000 freshmen at Illinois Institute of Technology 
indicate characteristic and significant profiles for the various groups. 
The Fire Protection Engineering freshman, however, has a profile which 
differs from that of other groups primarily because there is no area in 
which marked interest is exhibited by the group. The profile for the Fire 
Protection Engineering group is a rather flat one, whereas other engineer- 
ing groups, and the non-engineering groups such as in Architecture, have 
profiles with definite peaks and valleys, indicating strong and character- 
istic interests. 

* Read at the 20th Annual Meeting of the Midwestern Psychological Association, 
St. Paul, Minnesota, May 8, 1948. 
¹ Speer, G. S. The vocational interests of engineering and non-engineering students. 
J. of Psychol., 1948, 25, 357-363. 




As we know that graduates of this department enter such different 
types of work as sales and engineering, it may be presumed that the 
students enter the department with different goals in mind. Obviously, 
such students would be likely to have more varied interests than students 
entering a chemical engineering curriculum. Marked interests of one 
subgroup might be expected to cancel the lack of interest by another sub- 
group so that the mean scores of the total group could be expected to 
approach the average on the scale. 

In order to investigate the utility of the Preference Record for Fire 
Protection Engineering students it was felt that the study should be 
extended to graduates of the department to determine: (1) the kind of 
work engaged in following graduation, and (2) the interest patterns of the 
graduates. 


Table 1 
Percentile Rank of Mean Kuder Raw Scores of FPE 
Freshmen and Alumni 

                   FPE        FPE      FPE       FPE     FPE    FPE 
                   Freshmen   Alumni   Alumni*   Sales   Eng.   Adm. 
Mechanical           56         64       45        59     75     69 
Computational        65         56        7        53     55     65 
Scientific           55         49       27        49     56     47 
Persuasive           65         88       80        95     77     77 
Artistic             50         50       45        54     39     49 
Literary             60         55       78        55     53     61 
Musical              47         43       62        40     47     43 
Social Service       43         53       56        54     64     41 
Clerical             39         26       37        23     20     30 
N                    34        118       14        52     36     30 


* Fire Protection Alumni who have left the field of Fire Protection. 


With the cooperation of Professor Ahern, Chairman of the Depart- 
ment of Fire Protection Engineering, the Preference Record was sent to 
198 alumni of the Fire Protection department, who had indicated an 
interest in the study. Returns were received from 177 of these, and 
their results may be compared with those of the students. The equivalent 
Kuder percentiles of the mean raw scores of these two groups are shown 
in Table 1. 

When we compare the scores of the Fire Protection students with the 
scores of the alumni still in Fire Protection work, certain differences in 


2 Guilford and Martin's Personnel Inventory I was also administered to the alumni, 
but was not given to the students. No appreciable differences were found between the 
sales, engineering, or administrator groups on any of the scales of Objectivity, Agreeable- 
ness, and Cooperation. All three of the groups had mean C-scores of 6 on Objectivity 
and Cooperation, and 5 on Agreeableness. 




interest become apparent. These data are given in Table 2. The alumni 
show a significantly greater interest in persuasive and social service 
activities, and significantly less interest than the students in clerical 
activities. Other differences shown in Table 2 are not statistically 
significant. 
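
The critical ratios reported in Table 2 can be checked with a short Python sketch. It assumes the conventional formula for two independent groups, in which the standard error of the difference is the square root of the sum of the squared standard errors of the two means; the function name is hypothetical, and the figures are those read from Table 2.

```python
import math


def critical_ratio(mean_a, se_a, mean_b, se_b):
    """Difference between two means divided by the standard error of
    that difference (independent groups assumed)."""
    difference = abs(mean_a - mean_b)
    se_difference = math.sqrt(se_a ** 2 + se_b ** 2)
    return difference / se_difference


# Persuasive scale: freshmen 71.2 (S.E. 3.05), alumni 84.6 (S.E. 1.84).
print(round(critical_ratio(71.2, 3.05, 84.6, 1.84), 2))

# Clerical scale: freshmen 50.3 (S.E. 1.65), alumni 43.5 (S.E. 1.43).
print(round(critical_ratio(50.3, 1.65, 43.5, 1.43), 1))
```

Ratios of 3.0 or larger are those treated as significant in the table.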

A small number of the alumni who returned the questionnaire in- 
dicated that they have left the field of Fire Protection work, and have 
entered other types of activity such as banking, merchandising, and 
personnel. Although this number is small, only fourteen, the interest 
patterns of this group were compared with the alumni still active in Fire 
Protection. The scores of the two groups, shown in Table 1, indicate 
very striking differences in preference. Those alumni who are no longer 
in Fire Protection work exhibit much greater interest in computational, 


Table 2 
Raw Kuder Scores of FPE Students and Alumni 

                     Freshmen          Alumni 
                   Mean    σM        Mean    σM        D      D/σD* 
Mechanical         83.0   3.25       86.5   1.48      3.5      - 
Computational      38.4   1.96       35.9   1.07      2.5      - 
Scientific         70.5   2.10       68.3   1.25      2.2      - 
Persuasive         71.2   3.05       84.6   1.84     13.4     3.76 
Artistic           46.8   2.10       46.5   1.76      0.3      - 
Literary           49.9   2.71       47.9   1.61      2.0      - 
Musical            16.0   1.32       15.3   0.81      0.7      - 
Social Service     58.7   1.63       65.4   1.51      6.6     3.0 
Clerical           50.3   1.65       43.5   1.43      6.8     3.1 

N                  34                118 


* Only ratios of 3.0 or larger are shown. 


literary, musical, and clerical activities, and much less interest in mechani- 
cal and scientific activities than those still in the field. Similar differences 
are found when this group is compared with the freshmen students, as 
shown in Table 1. The alumni no longer in Fire Protection work express 
less interest in mechanical and scientific activities and more interest in 
computational, literary, and musical activities. They also express more 
interest in persuasive and social service activities than do the freshmen. 

The alumni who are now in Fire Protection Engineering work also 
indicated whether their work is primarily sales, engineering, or administrative 
in nature. The scores of these three subgroups are also shown in 
Table 1. 

Although all three of the alumni groups are higher in persuasive 
interest than the student group, the salesmen are much higher on this 




scale than the other two. Both the sales and the engineering alumni 
groups have higher social service interests than the students or the ad- 
ministrators, and also lower computational and clerical scores. In me- 
chanical interest the sales alumni do not differ significantly from the 
students, although the administrators are considerably higher, and the 
engineer group is still higher. 

In short, the average freshman student in Fire Protection Engineering 
is less scientific in interest than other student engineering groups at 
Illinois Institute of Technology, and exhibits no really strong preferences 
for any of the areas measured, though his clearest interest is in working 
with figures and with people. The alumnus who is now in sales work is 
no more scientific in interest than the student, but has a much greater 
interest in working with people, and a somewhat greater interest in work 
which is of benefit to others. The alumnus now in engineering work has 
a greater interest than the students in working with people, and a stronger 
interest in work which will be of benefit to others. Although apparently 
no more scientific in interest than either of the other alumni groups or the 
students, he has a stronger interest in work which concerns machines and 
mechanical activities. 

The alumni administrator group differs the least from the average 
Fire Protection Engineering freshman, showing only somewhat higher 
interest in mechanical and persuasive activities. 

In attempting to evaluate these results in order to use them more 
effectively in the guidance of students, there arises the problem of determining 
whether the alumni entered sales or engineering because the 
individuals were the kinds of persons indicated by these scores, or whether 
they developed these interests as a result of the type of work which they 
were doing. 

Although the groups are too small to permit definite conclusions, we 
felt that some evidence on this point might be obtained by comparing the 
scores of the alumni who had been employed for varying periods of time. 
Here it was necessary to depend upon the date of graduation as an indi- 
cation of the length of employment in these fields. This is probably 
sufficiently accurate for the sales and engineering groups, but is probably 
not accurate for the administrator's group, as most people do not obtain 
managerial responsibilities until after a period of employment in other 
capacities. 

For each of the alumni groups, the scores were arranged in order of 
the date of graduation, and arbitrarily classified by four dates: those who 
had graduated between 1910 and 1919, 1920 and 1929, 1930 and 1939, 
and since 1940. Ideally, such data should represent a series of measures 
obtained from the same persons over a period of time, rather than the 




cross-sectional approach used here. The latter method, though less 
desirable, is used of necessity. 

The artistic and musical scores on the Preference Record are relatively 
constant for the three alumni groups, and do not differ greatly from the 


alumni show little change, but the sales group exhibits a marked decrease 
in literary interest for the older alumni. The computational and clerical 
interests differ from the scores of the students, as shown above, but with 


groups. An exception is the rating on clerical interest of the alumni now 


in sales work where there is found a steady increase in clerical interest the 
longer the individual has been a graduate. 


Table 3 
Date of Graduation and Kuder Percentiles of FPE Alumni in Sales, 
Engineering, and Administrative Work 

Activity Graduation Mec. Comp. Sci. Pers. Art. Lit. Mus. Soc. Cler. 
Sales 1920-29 51 


53 
1980-39 80. 56. 52^ 6 58 8 пл о» 
1940-46. п 5 81 OF 85» 80 5 M 

Engineering 1920-20 88 36 65 78 70 7 0 7 2 
1930-39 70 75 58 65 30 55 55 75 20 
1040-40 65; 55 4T. 85. 42.180 492 8 2 

Administrative 1910-19 50 88 59 52 зв 67 42 39 % 
1920-99 65 65 43 72 6 67 55 35 85 
1930-29 78 67 50 90 6l и 42 39 45 


The differences in score on mechanical interest are shown in Table 3. 
The sales group shows only minor and unimportant differences in this 
score, but both the administrators and the engineers exhibit definite 
changes. The engineers indicate a marked increase in mechanical interest, 
while the administrators show an equally sharp decrease in such 
interests. 

The engineers show a steady increase in interest in scientific activities. 
The differences found for the administrators do not reflect a consistent 
change. The scores of the sales groups, however, seem to indicate some 
trend toward an increase in scientific interest. 

In all three of the age groups the salesmen have exceptionally high 
persuasive scores. The engineers begin with lower scores than either the 
salesmen or administrators, and show a declining trend the longer they 




have been employed in that occupation. The administrators show a 
sharp and steady decrease in such interests. 

In social service interest, the administrators are low at all age levels, 
but both salesmen and engineers show a steady increase, with the rate of 
increase somewhat sharper for the salesmen than for the engineers. 

To summarize, it seems probable that the freshman students in Fire 
Protection Engineering are a more heterogeneous group than the students 
in other departments. Those students who continue in Fire Protection 
work tend to enter one of two rather different types of activity, sales or 
engineering, although the sales work in this field is frequently closely 
related to engineering. About one-third of them may look forward to 
administrative responsibilities beginning ten years after graduation. 

Those with very high persuasive and very low scientific and social 
service scores tend to enter sales; those with high mechanical, scientific, 
and social service scores, and low persuasive scores tend to enter engineer- 
ing activities. Those with high computational, literary, musical, and 
clerical interests and low mechanical and scientific interests tend to leave 
the Fire Protection field to enter other occupations. From these results 
it also appears that those who later achieve administrative responsibility 
quite quickly approach the average in each of the interest areas measured. 

Recognizing again that the results must not be considered too definite 
because of the small numbers involved, it may nevertheless be possible to 
draw some tentative conclusions from these data. We conclude, then, 
that for guidance purposes the Preference Record is a useful instrument. 

A second conclusion is that the Preference Record appears to be 
sensitive to life experiences of the individual, so that the interpretation 
of scores must consider both his present stage of development, and a 
static job profile, in relation to possible changes in the interest pattern. 
Received June 16, 1948. 


Certain Factors Bearing on the Cleeton 
Vocational Interest Inventory 


Lester Nicholas Recktenwald 
Archdiocesan Veterans’ Guidance Center, Inc., New York City 


This paper treats certain factors which ought to be taken into con- 
sideration in any future evaluation of the Cleeton Vocational Interest 
Inventory. 

Procedure 


The inventory was administered to 166 twelfth grade high school 
boys at the beginning of a semester. One of the directions for marking 
the 630 items requires that the subject place a plus sign beside the item 
he likes or would answer “yes” and a zero sign beside the item he dislikes 
or would answer “no.” Fifteen weeks later, the inventory was again 
administered to the boys. From the original responses and the digres- 
sions from them on the second administration, certain data were obtained. 


Findings 
1. The number of pluses and the number of zeros marked on the first 
administration tended to balance. The following totals for each type of 
response were obtained. 


It appears from the above figures that about equal opportunity exists 
for plus and zero responses to be made. Whether this is a chance or fore- 
seen phenomenon peculiar to the instrument is not known. No doubt, 
it is a desirable characteristic in that it tends to allow a reasonable num- 
ber of both types of responses in any given situation. Moreover, it is 
unlikely that any given individual will make only one type of response. 

2. Five one-hundredths (0.05) per cent of all responses by all the boys on 
the first administration were left blank (from preceding figures). It would 
¹ Cleeton, Glen U., The Cleeton Vocational Interest Inventory, Men, McKnight and 
McKnight, Bloomington, Illinois, 1937; revised edition, 1943. This study is based on 
the original edition which differs from the revised in internal construction in that it contains 
one less category of “vocational interests.” It is not concerned with the SR 
section which has been omitted in the revision. 



appear, therefore, that the requirement for only plus or zero responses is 
justifiable if the criterion of readiness to respond either positively or 
negatively is used. However, it must be stated in this connection that 
instructions were expressly given that either response was to be used. 
If a third alternative had been suggested, the figures might have been 
quite different. Instances of complaints did occur because certain boys 
thought that a mark for an indifferent attitude should be allowed. 

3. A 21.4% change occurred in the responses of the boys after 16 weeks. 
Cleeton² has reported a 6.1% change by unspecified individuals within a 
month. As will be observed, the results of the present investigation are 
far greater than this figure although, of course, the time lapse between 
administrations is greater in the present study. 

In individual cases, high percentage change of all responses did not 
necessarily change the preferred category³ of “vocational interests.” 
One case is here cited. 

Case X 


Total change was 35%. Point score in the preferred category, me- 
chanic (MEG), initially was 48; point score rose to 49 in the same category 
which was also the preferred one on the second administration. Mechan- 
ical interests, as estimated by the inventory, persisted though percent- 
age change of responses to the 630 items was great. 


Table 1 
Number and Per Cent of Cases Preferring Certain Categories 

                              First Administration    Second Administration 
                                No.        %             No.        % 
Mechanic (MEG)                   75        45             75        45 
Engineer (EFC)                   52        31             49        30 
All others (7 categories)        39        24             42        25 
Totals                          166       100            166       100 


In the less obvious cases, it would have taken an experienced analyst 
to detect a similarity of interest patterns, if similarity existed,⁴ on both 
administrations. By and large, however, the preferred category was 
affected when large percentage change of total responses occurred. 


² Cleeton, Glen U. Manual of Directions for Vocational Interest Inventory. McKnight 
and McKnight, Bloomington, Illinois, 1937, p. 21. 

³ “Preferred category” in this and subsequent references means that category in 
which the subject had made his highest point score. It does not refer to highest 
percentile rank which has been assigned to norms in the later edition. 

⁴ The question whether interest patterns actually changed is beyond the scope of 
this paper. 




4. About three-fourths of the preferred categories were either the mechanic 
(MEG) or engineer (EFC) on both administrations. 

From the data in Table 1, it appears that the inventory had segregated 
the extreme likes of these boys consistently as a group. What the results 
would be in another locality where the vocational make-up of the com- 
munity would be quite different is, of course, not known nor can they be 
ascertained from the norms furnished in the original or revised manual. 

5. When a change of preferred category took place, the mechanic (MEG) 
and the engineer (EFC) interchanged more readily than any other. 


Table 2 
Number and Per Cent of Cases Retaining and Shifting Preferred Category 

                                       Second Administration 
First               Total           EFC           MEG        All Others 
Administration     No.    %      No.    %      No.    %      No.    % 
EFC                 52   100      36    70      10    19       6    11 
MEG                 75   100      12    16      59    79       4     5 
All Others          39   100       3     8       5    13      31    79 


The data in Table 2 should be read as follows: Of 52 boys initially 
preferring the engineer (EFC) category, 36 or 70% retained that category 
as the preferred one on the second administration; 10 or 19% shifted to 
the mechanic (MEG); and 6 or 11% shifted to other categories. It 
should be stated that the number of cases preferring other than the two 
categories named was insufficient from which to draw any conclusion. 


Table 3 

Change of Response to Initially Liked Occupation Occurring 
in Selected Combinations of Items 

                                               No. of       No.         % 
Combination                                    Pluses     Changed    Changed 
Three Most Liked Occupations in 
  High Category                                   498         30       6.02 
Other Liked Occupations in High Category         1591        371      23.32 
Liked Occupations in Other Seven 
  Categories (excluding high and low)            7766       2578      33.19 
Liked Occupations in Low Category                 590        256      43.49 


530 Lester Nicholas Recktenwald 


6. More reliance can be placed on the three most liked occupations in the 
highest scoring category than on any other combination of liked occupations 
analyzed from the point of view of constancy of response. 

From the figures in Table 3 one notes a progressive increase in percentage 
change. Within the limitations imposed by the grouping of the 
middle seven categories, we can say that the classification of occupational 
items in the respective categories appears justified on the basis of degree 
of change to be expected. From the point of view of the counselor, more 
reliance can be placed on the three most liked occupations in the highest 
scoring category than on any other combination of liked occupations. 


Limitations 
The results involving change of responses after fifteen weeks may be 
somewhat colored by the fact that some of the boys investigated information 
about certain occupations listed in the inventory. Certain responses 
on the second administration were affected by such study but the 
exposure to such information, under the circumstances existing in this 
investigation, would hardly influence the larger results encompassed in 
this paper to a marked degree. 
Another factor limiting the study is that point score rather than 
percentile rank of score was used in isolating the preferred category. 
The 1943 norms show that the two indices do approximate each other 
but undoubtedly the latter is to be preferred. 


Summary 

The data presented show both favorable and unfavorable phases 
in connection with the use of the Cleeton Vocational Interest Inventory 
with twelfth grade boys. From the responses on the first administration, 
the data appear favorable as to the scope of the items included in the 
inventory. The requirement that only plus or zero responses be made 
did not always satisfy the subjects used in this study although only a 
negligible percentage of the items were left blank. From the figures 
involving change after fifteen weeks, results are somewhat mixed. 
Certainly a change of about one-fifth of the responses leaves much to be 
desired as to the stability of attitudes toward the items in an inventory 
purporting to “measure vocational interests.” At the same time, large 
change of all items does not appear in all cases to change the preferred 
category.³ Where it did, an experienced analyst might have been able to 
detect similarity of interests, if similarity existed, on both administrations. 
The inventory seemed to segregate the extreme likes rather consistently 
but the evidence in this study permits this to be said only for 
the mechanic (MEG) and engineer (EFC) categories. Considerable 
reliance can be placed on the most liked occupations in the preferred category 
from the standpoint of the likelihood that plus responses will occur 
again on a subsequent administration. 

The above conclusions must necessarily be viewed within the con- 
ditions and limitations of the study. 


Received February 19, 1948. 


A Study of Stage Fright and the Judgment of Speaking Time * 


Ernest H. Henrikson 
University of Minnesota 


This study had its origin in the statement often made by students that 
time seems so long when they are speaking, especially if they are somewhat 
afraid; that a brief pause seems minutes long, etc. The purpose of 
this study was to see whether there is any significant difference between 
the judgment of elapsed time when a person is speaking and when he is 
not and if the judgment of length of speaking time is influenced by the 
degree of stage fright. 


Procedure 


Part A: To check on whether or not college students think there is a 
correlation between degree of stage fright and judgment of speaking time, 
seventy-five of them were asked to select the one of the three following 
statements which, in their opinion, was closest to correct. “When a 
person is speaking: 

1. The more afraid he is, the longer his speaking time will seem to him. 

2. His fear will have no influence on how long the time he speaks seems to him. 

3. The more afraid he is, the shorter his speaking time will seem to him.” 

On another day, the same students were asked similar questions with the 
substitution of “less afraid” for “more afraid” in questions 1 and 3 and 
with the order of presentation reversed. 

Part B: One hundred and ten students in six sections of Fundamentals of 
Speech, not including any of those who took part in Part A, guessed the 
length of a period of time during which they sat doing nothing. The 
time was changed for each section (40, 50, 60, 75, 80 and 90 seconds) so 
no group would profit from the experience of any other group. Then 
each student gave an impromptu speech on a subject which was handed 
him typed on a card and left face down until the speaker preceding him 
began to talk. The speech subjects were statements from Aesop's 

* The original data for this study were collected at the University of C 


Analysis of these data was a part of the research program of the Speech Clinic ( 
the Dean of Students) of the University of Minnesota. 




Fables, such as “Look before you leap,” “A stitch in time saves nine,” 
etc. After each speaker had talked, he guessed his speaking time in 
seconds. He also indicated his stage fright on a scale from 1, indicating 
no stage fright, through 10, indicating very great stage fright.¹ 

The experimenter kept a record of the actual speaking time, measuring 
it with a stop watch. A comparison of the amount of error made in 
judging the speaking and the non-active periods was made. Since the 
number of persons in some of the ten levels of stage fright was small and 
since the purpose of this study was to study a trend, a grouping was made 
on the basis of extremes. This was done as follows: Group I: Below 
average stage fright (those who checked 1 through 4); Group II: Average 
fright (those who checked 5 through 6); and Group III: Above average 
fright (those who checked 7 through 10). 
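
The three-way grouping just described can be written as a small lookup. The sketch below is only an illustration of the stated cut points (1 through 4, 5 through 6, 7 through 10); the function name and the sample ratings are hypothetical and are not taken from the study.

```python
def fright_group(rating: int) -> str:
    """Map a self-rating of stage fright (1-10) to the three groups described above."""
    if not 1 <= rating <= 10:
        raise ValueError("rating must be between 1 and 10")
    if rating <= 4:
        return "I"    # below average stage fright
    if rating <= 6:
        return "II"   # average stage fright
    return "III"      # above average stage fright


# Hypothetical ratings, for illustration only (not data from the study).
ratings = [1, 4, 5, 6, 7, 10]
groups = [fright_group(r) for r in ratings]
```

Collapsing ten sparsely filled levels into three groups trades resolution for larger cell sizes, which is the reason the text gives for the grouping.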


Results 


Most students (95%) believe that there is a positive relationship between 
the intensity of a person’s stage fright and the length of time he 
will believe has elapsed during a speech he has given. The more afraid 
he is, the longer the time will seem to him; the less afraid he is, the shorter 
the time will seem to him. Table 1 reveals this situation and also shows 
the consistency with which students make these judgments in responding 
on two different days to the two forms of the questionnaire. 


Table 1 
Students’ Responses to Stage Fright Questionnaire 

                               Second Presentation 
               Response*      1       2       3     Total   Per Cent 
First              1         71                       71      94.6 
Presentation       2                  2                2       2.7 
                   3                          2        2       2.7 
               Total         71       2       2       75 
               Per cent     94.6     2.7     2.7              100 


* Response 1 = The more afraid a person is, the longer his speaking time will seem; 
2 = Fear has no influence on how long speaking time will seem; and 3 = The more afraid 
a person is, the shorter his speaking time will seem. 

Table 2 indicates that the students in this experiment tend to judge 
a “non-active” period as longer, rather than shorter, than it is and, in 
contrast, they tend to judge their speaking period as shorter, rather than 
longer, than it is. Sixty of the 110 students overestimated the length of 
the inactive period but only 22 of the 110 students overestimated the 
length of the speaking time. 


Table 2 
Students’ Error in Time Judgments 

                   Number Misjudging              Number Misjudging 
                     Inactive Time                  Speaking Time 
Per Cent         Over-     Under-               Over-     Under- 
of Error        estimate  estimate   Total     estimate  estimate   Total 
130.01-140          2         0        2           0         0        0 
120.01-130          0         0        0           0         0        0 
110.01-120          3         0        3           0         0        0 
100.01-110          0         0        0           0         0        0 
 90.01-100          3         0        3           2         0        2 
 80.01- 90          1         0        1           2         4        6 
 70.01- 80          4         0        4           2         6        8 
 60.01- 70          3         2        5           0         8        8 
 50.01- 60          3         4        7           1        10       11 
 40.01- 50          6         2        8           1        17       18 
 30.01- 40          7         3       10           2        15       17 
 20.01- 30         11        11       22           0         9        9 
 10.01- 20         14        11       25           9        12       21 
   .01- 10          3        10       13           3         5        8 
No error                               7                              2 
Total              60        43      110          22        86      110 


¹ Henrikson, E. H. Some effects on stage fright of a course in speech. Quart. J. 
Speech, 1943, 29, 490-491. 

The students’ error in judging their speaking time (mean 39.0% ± 
7.7%) is slightly larger than their error in judging the non-active period 
(mean 36.1% ± 8.4%) but the difference is not statistically significant 
(C.R. = .715) (see Table 3). The percentage of error in judging the 
length of a period of time, whether non-active or speaking, is not influenced 
significantly by degree of stage fright. Judging speaking time, those 
least afraid (Group I) made a mean error of 38.5% ± 21.0% while those 
most afraid (Group III) made a mean error of 36.2% ± 12.9% (C.R. = .4). 
Judging the inactive period, the same groups made mean errors of 35.8% 
± 44.0% and 35.1% ± 16.% with a C.R. of .09. Neither difference is 
significant. 
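
For readers unfamiliar with the critical ratio (C.R.), it is conventionally the difference between two means divided by the standard error of that difference. The sketch below shows that general form for two independent samples. The article does not state how its ± figures or its C.R. values were derived, and the speaking and inactive judgments came from the same students, so the independence assumption here is a simplification; the function name and the inputs are hypothetical.

```python
import math


def critical_ratio(mean_a: float, se_a: float, mean_b: float, se_b: float) -> float:
    """Difference between two means divided by the standard error of that difference.

    Assumes the two samples are independent, so the standard error of the
    difference is sqrt(se_a**2 + se_b**2).
    """
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return (mean_a - mean_b) / se_diff


# Hypothetical inputs chosen for easy arithmetic; not figures from the study.
cr = critical_ratio(mean_a=10.0, se_a=3.0, mean_b=6.0, se_b=4.0)

# Conventional large-sample cutoffs for a two-sided normal test.
significant_at_5_percent = abs(cr) >= 1.96
significant_at_1_percent = abs(cr) >= 2.58
```

A ratio well below 2, like those reported above, means the observed difference is small relative to its sampling error.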

More students (27%) of the group with little stage fright (Group I) 
tend to overestimate the length of their speaking time than do those with 
greater stage fright (18%) (Group III). This difference is not statistically 
significant (C.R. = .87). Thus, the experimental findings are not in 
accord with the students’ expressed beliefs. 


Table 3 
Comparison of Students’ Judgment of Speaking Time and Inactive Time Related to Their Degree of Stage Fright 
[Table set sideways in the original; its entries are not legible in this copy.] 




Summary 
Under the conditions described and within the limits indicated, most 
students (95%) believe that the more afraid a student is, the longer his 
speaking time will seem to him. The experimental results, however, 
indicate that persons of all degrees of stage fright may make errors in 
judging a period of time, whether they make the judgment while they are 
speaking or while they are sitting doing nothing. But there is no significant 
tendency for degree of stage fright to correlate positively with an 
estimation of speaking time, as the students in this study thought would 
be the case. The tendencies which emerged were actually in the opposite 
direction. 
Received February 18, 1948. 


Cumulative Effect of Marginal Conditions Upon 
Rate of Perception in Reading * 


Miles A. Tinker 
University of Minnesota 


Various environmental factors may influence speed of perception in 
reading. Some typographical arrangements foster a rapid rate of perception; 
other arrangements retard the rate. Illumination is another 
important factor. The intensity of illumination must be above a certain 
level or speed of perception in reading will be retarded. A marginal condition 
is here defined as one in which the rate of perception in reading is 
not significantly retarded, but would become so with further changes of 
the condition in an adverse direction. If several of these marginal conditions 
are present, do the effects cumulate to produce a significant retardation 
in rate of perception in reading? 

Paterson and Tinker (1) show that italic printing retards speed of 
reading by 2.7 per cent. This is not a statistically significant difference. 
They also report (page $1) that 8 point type is read 3.4 per cent slower 
than an optimal size (10 point). This also is not a significant difference. 
Tinker (2) found that, with adequate adaptation of the eyes, illumination 
intensities below about 3 foot-candles retarded speed of perception in 
reading 10 point type. The rate did not change for 3 foot-candles and 
above. Apparently the critical or marginal level of illumination for 
reading 10 point type is approximately 3 foot-candles. 

It is of theoretical and practical interest to know the result on speed 
of perception in reading of combining these three marginal conditions, 
i.e., 8 point type with italic printing to be read under 3 foot-candles of 
light. The problem of this study, therefore, is to investigate the effect 
upon speed of perception in reading of combining three marginal condi- 
tions: illumination intensity, type form, and type size. 


Materials and Procedure 


The reading material consisted of Forms I and II of Tinker's Speed of 
Reading Test.¹ In each form there are 450 paragraphs (items) of 30 
words each. As a check on comprehension, the reader crosses out the 
one word that spoils the meaning in each item. The two forms are 
approximately equivalent. 

* The writer is grateful to the Graduate School, University of Minnesota, for research 
grant to finance this study. 
¹ This test is not yet available for general distribution. 

The experiment was conducted in a light laboratory. The illumination 
was indirect and thus well diffused. There were two groups of subjects: 
the control and the experimental group. For the most part, two 
subjects were tested at a time: the first two were in the control group, the 
next two in the experimental group, and so on. There were 83 subjects 
in each group. 

With the control group, the procedure on arriving at the laboratory 
was as follows: a light intensity of 25 foot-candles was provided. The 
subject was adapted to the light for 3 minutes while a preliminary drill 
exercise was given to familiarize him with the test procedure. Then the 
standard set-up of Form I of the test was given with a 10 minute time 
limit. The standard set-up was 10 point roman, Excelsior type face, 
set with two points of leading in a 20 pica line width on eggshell paper 
stock. Then after a 15 minute rest, Form II was given with a 10 minute 
time limit. Typographically, Form II was the same as Form I in this 
(control) group. Use of the control group makes it possible to correct 
the means in the experimental group for amount of inequality in difficulty 
between Forms I and II. 

Testing procedure for the experimental group was as follows: on 
arrival at the laboratory the subjects were adapted for 3 minutes to the 
25 foot-candles of illumination and given the practice exercise. They 
were then given the standard set-up of Form I, just as with the control 
group. The subjects were then adapted to 3 foot-candles of light for 15 
minutes while resting. Then they were given Form II of the reading test 
set-up as follows: 8 point italic, Excelsior type face, one point leading in 
a 12 pica line width on eggshell paper stock. One point leading is adequate 
for 8 point in this line width (1). The time limit was ten minutes 
as in all the other testing. In every instance the subjects were instructed 
that it was a test for speed and accuracy, and to work rapidly but not 
make mistakes. There was, therefore, a strong emphasis upon speed of 
response. Some unpublished work done by the writer indicates that the 
speed attitude in reading can be maintained at a high and constant level 
for a 10-minute interval. 


Results and Discussion 


The data of this experiment are given in Table 1. Test Group I is 
the control and Group II the experimental group. In column 8 is the 
correlation between Forms I and II. These coefficients of .94 and .95 
indicate very high equivalent-form reliability for the reading tests. In 
Test Group I, Form II, which is identical typographically with Form I 
and which was read under the same illumination as Form I, was read 
1.18 paragraphs slower than Form I. This serves as a “correction” for 
the differences listed in column 6 of the table. Incidentally, the difference 
of 1.18 is not statistically significant. In Group II is shown the 
combined effect of the marginal conditions (8 point italic type read under 
3 foot-candles of light) in comparison with the standard arrangement 
(10 point roman type read under 25 foot-candles of light). The speed 
of perception in reading was much slower with the marginal conditions 
operating together. Thus the 8 point italic with 3 foot-candles of light 
was read 12.29 paragraphs slower than the standard set-up. In other 
words, 369 fewer words were read in the 10 minute period. This amounts 
to a difference of 10.4 per cent. The critical ratio in column 9 reveals 
that this difference is significant beyond the one per cent level. 


Table 1 
Cumulative Effect of Marginal Conditions Upon Speed 
of Perception in Reading 

                                                          Differences Between 
                                                               Means in 
Test     Test Form and       Foot-      Mean               Para-     Per               D 
Group    Type Size           Candles  Paragraphs   SE_M    graphs*   Cent      r    SE_diff 
(1)      (2)                  (3)        (4)       (5)      (6)      (7)      (8)     (9) 

I        I, Roman 10 Pt.       25      118.75      2.40 
         II, Roman 10 Pt.      25      117.57      2.74     0.00      0.0     .95     0.00 

II       I, Roman 10 Pt.       25      118.53      2.49 
         II, Italic 8 Pt.       3      105.00      2.34    12.29     10.4     .94    14.43 


* The differences in column 6 are “corrected” by the amount of the difference between 
the mean scores of Form I and Form II of Test Group I which serves as a control 
group. The “correction” amounts to 1.18 paragraphs. 
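
The conversion from paragraphs to words and to a percentage can be checked with a few lines of arithmetic. The sketch below assumes that the percentage is taken against the experimental group's Form I mean of 118.53 paragraphs as read from Table 1; the article does not name the base it used, and the variable names are illustrative only.

```python
WORDS_PER_PARAGRAPH = 30      # each test item is a 30-word paragraph
corrected_difference = 12.29  # paragraphs, the corrected difference reported in the text
standard_mean = 118.53        # Form I mean of the experimental group (assumed base)

words_lost = corrected_difference * WORDS_PER_PARAGRAPH
percent_slower = 100 * corrected_difference / standard_mean

print(round(words_lost))         # 369
print(round(percent_slower, 1))  # 10.4
```

Under that assumption both printed figures, 369 words and 10.4 per cent, are reproduced.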

Apparently when these three marginal conditions are all present at 
the same time they produce a non-optimal visual situation which is not 
produced by any one of the three alone. Thus neither 8 point type, italic 
print nor 3 foot-candles illumination significantly retards speed of perception 
in reading.² But in combination they produce a retardation of 
over 10 per cent. This amount of retardation is only achieved by marked 
degrees of non-optimal printing arrangements or by reducing illumination 
to less than one-tenth of a foot-candle. 

The implications of these findings for the hygiene of vision in the 
reading situation are clear. It is not safe to use the critical level of 
illumination in visual situations where the details to be discriminated are 
also marginal. A combination of marginal conditions such as used here 
works together to produce markedly non-optimal visual performance. 
This has been shown to be true for reading 8 point italic print under 3 
foot-candles of light. It remains to be seen whether similar results occur 
with other combinations. 


² These results were obtained from reading periods of 1¾ minutes. There is a 
possibility that a longer reading period might have produced a significant difference. 
It is doubtful, however, that such a difference would be as large as 10 per cent. 

These results emphasize a point stressed previously by the writer. 
Critical (marginal) levels of illumination should not be employed for 
visual tasks. The intensity of light should be enough greater than the 
critical level to assure an adequate margin of safety, i.e., so that the visual 
task may always be done without loss of efficiency and with comfort. 


Summary 


1. The purpose of this study was to investigate the effect upon speed 
of perception in reading of combining three marginal conditions: illumination 
intensity, type form and type size. 

2. Eight point italic type was read under 3 foot-candles of light in 
comparison with reading 10 point roman type under 25 foot-candles. 
When employed as a single variable, neither 8 point type, italic type 
nor reading under 3 foot-candles retards speed of reading significantly. 

3. The 8 point, italic print, read under 3 foot-candles retarded speed 
of perception in reading by 10.4 per cent in comparison with reading 10 
point Roman type under 25 foot-candles. 

4. Thus these three marginal conditions, when operating together, 
produce a markedly non-optimal visual task. 

5. To maintain a hygienic visual environment, therefore, it is import- 
ant not to employ marginal (critical) levels of illumination. An adequate 
margin of safety above the critical level is needed. 


Received February 25, 1948. 
References 


1. Paterson, D. G., and Tinker, M. A. How to make type readable. New York: Harper 
and Brothers, 1940 (obtainable from the authors). 

2. Tinker, M. A. The effect of illumination intensities upon speed of perception and 
upon fatigue in reading. J. Educ. Psychol., 1938, 30, 561-571. 


Similarities and Differences in College Populations 
on the Multiphasic * 


Hugh S. Brown 
Los Angeles State College, California 


In a routine testing program the Minnesota Multiphasic Personality 
Inventory (MMPI) was administered to 542 General College Freshmen 
at the University of Minnesota in the fall of 1945. In interpreting the 
scores or profiles of individuals in this population it was important to 
know whether this normal population of General College Freshmen should 
be considered as a sample of the original normal population on which the 
MMPI was standardized, or if it was a population which was significantly 
different from this original norm group. 

In order to determine if this significant difference existed, and also to 
determine if college populations differ significantly among themselves in 
their scores on the MMPI, uncorrected for K, the Means and Standard 
Deviations of the General College population were compared with the 
Means and Standard Deviations of other College populations on which 
data were available. The group-profiles were compared to see if there 
was any similarity of pattern in MMPI profiles in college groups. Finally 
the chi-square test of significance was used to see if the number of 
General College students who had deviate scores on the MMPI was 
significantly different from the number expected by the authors of the 
MMPI in the general population. 


College Populations 


The following brief descriptions of the college populations, for which test 
scores on the MMPI were available, indicate the different selection factors 


Normal." According to Meehl 0 and McKinley and 
of the 


S: i th scales for the MMPI 


Hathaway (6), int 


Tesponses, which would discriminate the normal person from the abnormal, 


or parent, io-economic status, intelligence, and education. 
his pes entes deg "tcoll normal." ]t was composed of 155 
males and 110 females, mainly pre-college high school graduates, who had 
* This paper is a revision of part of the writer's Ph.D. thesis. 


Table 1 
Means and Standard Deviations on MMPI Scales of College Populations* 

College                                          MMPI Scales 
Populations                  Hs      D      Hy     Pd     Mf     Pa     Pt     Sc     Ma 

Original College       X̄   47.44  45.60  51.56  44.76   Not   48.11  46.58  47.29  48.54 
Male (N=155)           σ    5.88   8.85   3731   9.87  Given   7.36   8.08   7.48  10.05 

College Industrial     X̄   46.51  49.77  52.88  52.35  53.50  52.80  46.05  45.68  56.36 
Male (N=66)            σ    not reported 

Torrens Students       X̄   43.73  49.98  53.56  48.20  55.06  51.33  46.29  45.85  49.71 
Male (N=82)            σ    3.78   8.31   6.31   9.64   9.41   7.11   7.32   5.73   8.16 

General College        X̄   53.70  54.87  56.83  54.87  56.12  53.00  52.43  52.85  58.71 
Male (N=176)           σ   10.33  11.63   9.87  10.33   6.05  10.09   9.61   9.89  10.48 

Original College       X̄   45.41  45.80  49.87  45.68   Not   49.46  44.86  47.48  48.04 
Female (N=155)         σ    8.48   8.11   9.86   8.35  Given   7.89   7.76   7.39  10.84 

Loth Soph. Students    X̄   44.64  47.94  52.00  48.84  51.78  49.88  48.72  48.88  52.58 
Female (N=110)         σ    5.68   8.67   6.96   9.77   9.47   8.31   8.36   7.95   9.32 

Lough Students         X̄   47.10  48.40  52.70  49.70  50.30  52.20  48.70  51.40  54.60 
Female (N=185)         σ    7.98   9.42   9.05    116   9.25   8.69   9.21   9.27  10.14 

General College        X̄   46.69  49.93  52.97  53.84  53.03  52.23  48.25  49.22  54.73 
Female (N=366)         σ    7.04   8.78   7.71   8.84   9.50   8.48   8.35   7.88   9.60 


* K correction was not applied (7, 8). 


come to the University Testing Bureau for pre-college guidance, plus a number 
of representatives from college classes. They represented an adolescent, unmarried 
group, with a “preferred” rating as to intelligence, socio-economic 
status, and education. Since the builders of the MMPI were not interested 
at the time in these “college normals” other than as an aid in item selection, 
little further attention was given to the data on their scores, except as they were 
used to eliminate the items previously mentioned. 

College Industrial. A group of 66 college graduates employed in an industrial 
concern. These were reported from Dr. Starke R. Hathaway's files. 

Torrens Group (9). 82 junior medical students who had taken the MMPI 
while serving their neuropsychiatric clerkship at the University. These scores 
were made available by Dr. Hathaway. 

Loth Group (4). 110 female students enrolled in the elementary psychology 
laboratory course at the University of Minnesota in the Fall quarter. 

Lough Group (5). 182 female Teachers College students in New York 
State. 
General College Population, University of Minnesota. The test was administered 
to 176 males and 366 females who entered the General College in the 
fall quarter 1945. Six weeks had elapsed since registration so the students had 
become oriented to college. The full tide of veteran enrollment had not then 
reached the University as only 10% of the 1945 General College freshmen were 
veterans. 

In general, the student attends the General College, rather than some other 
college in the University of Minnesota, or elsewhere, because: (1) he has been 
unable to meet the scholastic requirements of another college; or (2) certain 
deficiencies in his high school record need to be made up; or (3) he wishes to 
secure a general education for two years. Transfer to another college at the 
end of two years is possible for some of the better students. Table 1 gives the 
Means and Standard Deviations of each population.¹ The group profiles 
based on these data are shown in Figures 1 and 2. 


Fig. 1. MMPI mean profiles for Male College populations. 


An examination of the mean scores and profiles indicates that although 
there are marked differences in elevation, there are certain similarities in 
the patterns of the Group profiles. The neurotic triad (Hs, D, Hy) indicates 
that, unlike the abnormal profile where D is usually the high 
point, these college profiles have Hy higher than Hs and D. On seven 
of the eight profiles Hs, D, and Pt are below the mean of the general 
population. The one exception is the General College male group. It is 
the only group with the average score on the Hs scale above the mean of 
the general population. Since young people as a whole do not ordinarily 
show this hypochondriacal tendency, this General College male population 
may differ slightly, psychologically, from the other groups studied. 

On only one profile is the Hy score below the mean of the general 
population, but it is still the high point of the neurotic triad. On six 
profiles Sc and Pt are below the mean. On eight profiles Pa and Ma are 
higher than Pt and Sc. In general all profiles seem to have their high 
points at Hy and Ma. In addition female profiles appear to be high on 
Pa scale, and the male on Mf scale. 


Fig. 2. MMPI mean profiles for Female College populations. 


¹ The summaries of data are taken from Brown (1). 

In spite of this similarity of pattern, however, slight variations appear 
when we examine the actual scores of the different populations. Table 2 
indicates the difference between the General College population and the 
original “college normal” group. 


Table 2 
Differences in T-Score Units Between the Means of General College 
Populations and the Original College Group 

            Hs      D       Hy      Pd      Mf      Pa      Pt      Sc      Ma 

Male 
N=176 
G.C.       53.70   54.87   56.83   54.87   56.12   53.00   52.43   52.85   58.71 
Orig.      47.44   45.60   51.56   44.76     a     48.11   46.58   47.29   48.54 
Diff.       6.20*   9.27*   5.27*  10.11*    a      5.49*   5.85*   5.56*  10.17* 

Female 
N=366 
G.C.       46.69   49.93   52.97   53.34   53.08   52.28   48.25   49.22   54.73 
Diff.       1.28    4.13*   3.0*    7.06*    a     2.278.   820°   12:76    6.69 

a. not given 
* difference significant at the 1% level. 


The General College male population was significantly different 
(1% level) on all scales. The female group was significantly different 
on all scales except Hs and Sc. 
The validating scores (?, L, F) were not compared. Because of the 
arbitrary assignment of T-score values to these scales, the usual tests of 
the significance of the differences could not be employed. The percentage 
distribution for General College students on the ?, L, and F scales, as given 
in Table 3, would not seem to indicate any significant difference from the 
distribution expected by the authors of the MMPI (3, 8). 
Comparisons between other populations showed similar differences. 
The female group reported by Loth was significantly different from the 
original college normal population on D, Pd, Pt, Ma scales. The female 
group reported by Lough was significantly different from the Original 
College Normal group on all scales except Hs and Pa. The General 
College female group was significantly different from the Lough female 
group on D, Pd, Mf and Sc scales. 
A comparison of the Male population showed the Torrens group of 
junior medical students significantly different from the Original College 
Normals on Hs, D, Pd scales. The College Industrial group was significantly 
different from the Original college normals on D, Pd, Pa, and Ma 
scales. The General College males were significantly different from the 
Torrens male group on all scales except the Mf and Pa. The G. C. males 
were significantly different from the College Industrial group on all 
scales except Pa and Ma. 


Table 3 
Percentage Distribution of 542 General College Students 
on ?, L, and F Scales 

      ?        L        F 
      —       3.0      2.7 
      .1     15.3     13.2 
    99.9     81.7     84.1 

_ In a normal distribution of scores having a T-score mean of 50, ap- 
proximately 2% of the scores might be expected to be at 70 or higher. 
Because the items selected on the MMPI were largely marked in the ` 
favorable direction, the distribution is slightly skewed and from three to
five percent of the scores of the general population exceed 70.
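
The 2% figure can be checked directly from the normal curve. The following is a minimal sketch, assuming the conventional T-score scaling of mean 50 and standard deviation 10 (the standard deviation is not restated in this passage); the function name is illustrative only.

```python
from statistics import NormalDist

# Assumed scaling: T-scores with mean 50 and standard deviation 10.
t_scores = NormalDist(mu=50, sigma=10)


def share_at_or_above(cutoff: float) -> float:
    """Proportion of a normal T-score distribution at or above `cutoff`."""
    return 1.0 - t_scores.cdf(cutoff)


print(f"{share_at_or_above(70):.1%}")  # prints 2.3%
```

A cutoff of 70 lies two standard deviations above the mean, so any observed rate of three to five percent already implies some departure from normality.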
All differences mentioned as significant are at the 1% level. In computing the
significance of the difference between groups, the homogeneity of variances was first
checked (1% level) before deciding whether or not to pool them in computing the
standard error of the difference.
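
The pooling decision described in this note can be sketched as follows. This is an illustration under stated assumptions, not the author's computation: the variance-ratio check, the use of n - 1 in the pooled variance, and the function name are all assumptions, and the critical F value must be supplied by the caller from a table.

```python
import math


def se_of_difference(sd1, n1, sd2, n2, f_critical):
    """Standard error of the difference between two group means.

    Variances are pooled only when their ratio (larger over smaller)
    stays below `f_critical`, a tabled F value chosen by the caller.
    Returns (standard_error, pooled_flag).
    """
    var1, var2 = sd1 ** 2, sd2 ** 2
    f_ratio = max(var1, var2) / min(var1, var2)
    if f_ratio < f_critical:
        # Homogeneous variances: pool them, weighting by degrees of freedom.
        pooled = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
        return math.sqrt(pooled * (1 / n1 + 1 / n2)), True
    # Heterogeneous variances: keep the two variance terms separate.
    return math.sqrt(var1 / n1 + var2 / n2), False


# Hypothetical inputs, for illustration only.
se, pooled = se_of_difference(10.0, 100, 10.0, 100, f_critical=1.6)
print(round(se, 3), pooled)
```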




When the percentage of scores of 70 and over, in the General College 
population, was compared with the maximum percentage expected in the 
general population, it was found that, in the female group, this 5 per cent 
figure was equalled on the Pd scale, and exceeded on the Mf scale (6%) 
and on the Ma scale (9%). The percentage frequency of these deviate 
scores among the male group exceeded the maximum 5% figure of the 
general normal population distribution on each scale.* The percentages 
were: 


 Hs     D     Hy    Pd    Mf    Pa    Pt    Sc    Ma
 6.8   11.8  10.8   7.4   9.1   8.5   6.8   6.8  18.8


Applying the chi-square test of significance it was found that the male 
General College population exceeded the maximum expected frequency 
at the 5% level of confidence on the D, Hy, Mf, Pa and Ma scales. The 
female group exceeded this level only on the Ma scale. 
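
A comparison of this kind can be sketched as a one-degree-of-freedom chi-square against the 5% maximum. The exact set-up used in the study is not described here, so the two-cell layout (deviate versus non-deviate), the function name, and the sample size in the example are assumptions for illustration only.

```python
CRITICAL_5_PERCENT_1_DF = 3.841  # tabled chi-square value, 1 degree of freedom


def chi_square_vs_expected(n, observed_deviates, expected_rate=0.05):
    """Chi-square for observed deviate scores against an expected rate.

    Two cells are compared: scores of 70 or over, and all other scores.
    """
    expected = n * expected_rate
    rest_observed = n - observed_deviates
    rest_expected = n - expected
    return ((observed_deviates - expected) ** 2 / expected
            + (rest_observed - rest_expected) ** 2 / rest_expected)


# Hypothetical group of 100 with 19 deviate scores (not the study's N).
chi_sq = chi_square_vs_expected(100, 19)
print(round(chi_sq, 2), chi_sq > CRITICAL_5_PERCENT_1_DF)
```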

Since chi-square is a test of significance and not a measure of the 
degree or magnitude of the relationship, it is worthwhile to call attention 
to the fact that, on the scales on which the General College populations 
were significantly different from the college normal group, the actual 
percentage frequency occurrence of deviate scores was from two to four 
times as great as the maximum expected by the authors of the test in the 
general population. 

On the basis of comparisons just made it would seem that, while there
is a similarity in the pattern of the profile of the various college groups,
there are significant differences not only between the scores of the General 
College populations and the original college normal population, but also 
among the several college populations on which MMPI data were avail- 
able. In addition, the frequency with which deviate scores of 70+ on 
the MMPI occurred in the General College population was considerably 
greater on many scales than would be expected by the authors of the 
test in a normal population. Since these mainly non-pathological populations
can hardly be regarded as other than populations which differ
significantly in their scores on the MMPI, it would seem that no one of 
them can be taken as representative of college students in this respect. 
Thus a valid interpretation of an individual profile on the MMPI would 
seem to involve a knowledge of whether the group of which the individual 
is a member differs significantly from the original “college normal” 
population used in the construction of the MMPI, and whether or not this 
difference is of psychological significance. 


* The work of Darley, Williams et al. (10) indicated that the General College population
with which they were dealing was as well adjusted as any college group. The
elevated MMPI profiles of the present General College group do not seem to correspond
with this finding. A later article will deal with this.




It would be unwise to attribute psychological significance to differ- 
ences which are only statistically significant. Without clinical evidence 
it would appear to be an equally grave error to ascribe psychological 
significance to scores classed as deviate with reference to the original 
norm group, which are not deviate scores when referred to the mean of 
the population with which we are working. In the General College male 
population, for example, a T-score two standard deviations above the 
mean of the group on the Ma scale would be 80.67. Clinical evidence 
alone can decide whether a T-score of 70 or a T-score of 80 on the Ma 
scale should be taken as “borderline” in this population. 
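
The point about referring a score to the local group can be made concrete by re-expressing a published T-score against that group's own mean and standard deviation. This is only a sketch: the function name is illustrative, and the mean (58.67) and standard deviation (11.0) below are hypothetical values chosen so that mean plus two standard deviations equals the 80.67 quoted above; the actual group statistics are not given in this passage.

```python
def local_t_score(published_t, group_mean, group_sd):
    """Re-express a published T-score relative to a local group.

    The result is again on a mean-50, SD-10 scale, but centered on
    the local group rather than on the original norm group.
    """
    return 50 + 10 * (published_t - group_mean) / group_sd


# Hypothetical local statistics for the Ma scale (illustration only).
GROUP_MEAN, GROUP_SD = 58.67, 11.0

for published in (70, 80.67):
    print(published, round(local_t_score(published, GROUP_MEAN, GROUP_SD), 1))
```

Under these assumed values a published score of 70 sits only about one standard deviation above the local mean, while 80.67 corresponds to a local T-score of 70.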

Personality tests are essentially an expression by the subject of his 
opinion or feeling on certain questions of a personal nature. Meehl and 
Hathaway (8) feel that college persons in particular tend to be on the 
defensive, consciously or unconsciously, in answering these questions, and 
developed the K scale on the MMPI to measure this tendency. They 
found that college persons tend to be differentiated from others on this 
scale but felt that the factors of age, intelligence, and the mere fact of 
being in college could be eliminated as the chief factors in this differenti- 
ation. They concluded that socio-economic status is the most plausible 
remaining variable which might account for the differentiation.

It would appear that the manner in which the items in a personality 
test are answered is not a function of the test situation, as in so many 
achievement and intelligence tests, but is rather a function of some other 
part of the subject’s life. Cultural and environmental factors could influ- 
ence the opinions of individuals on these questions to such an extent that 
it would be difficult to pronounce any one answer as normal or average 
for all populations. 

Inspection of items selected from the MMPI suggests how, in certain 
cultures or environments, the answer "true" would be the response of an 
average individual, while in other cultures “false” would be the prevail- 
ing response. Such illustrative items are: 

“I go to church almost every week.”

“I believe there is a devil and a hell in after life.”

“I pray several times a week.”

“I read in the Bible several times a week.”

“I do not like to see women smoke.”

“I believe women ought to have as much sexual freedom as men.”

“Children ought to be taught all the main facts of sex.”

It would seem that answers reflecting the individual’s culture and
environment in these, and in other items, might so influence the score on 
the test that a given score would not have the same significance in differ- 
ent populations. 




Some of the characteristics of the General College population which 
may influence the significance of scores on the MMPI are the cultural 
and economic status of the homes, their geographic location, the education
of the parents, the religious background of the home, and the educational
aspirations of the parents for the children. Other factors which
may influence the significance of scores are the general ability level of the 
students, their vocational aspirations, their success in achieving educa- 
tional and vocational goals, their success in finding a rounded social life 
in college, and the general morale of the student body. 

The General College population may be different from the general 
population simply because it is a college group. The academic standing 
of most of its members may make it a population different in many 
respects from another college group, or from an unselected college popula- 
tion. In any case it is possible that many factors such as those already 
mentioned have operated in the General College population to an extent 
sufficient to cause a difference in attitude which might account for scores 
on the MMPI which are significantly different, not only from the general 
normal population, but also from the original college normal group; and 
from other college populations on which data were available. In this case
it would seem that some modification of the usual interpretation placed 
on an MMPI profile may be necessary to insure a valid estimate of the 
personality of individuals in this population. 

In the field of intelligence and achievement testing it has become 
accepted practice to establish local norms for the population in which the 
test is to be used, as a basis for counseling. In the field of personality 
testing, however, this procedure does not appear to have been followed 
to any great extent. Norms are usually established by the authors of 
a personality test, and these tend to be accepted as valid for all popula- 
tions. 

The differences among the college populations noted here, and the 
differences between these college populations and the original “college
normal” population would seem to be of sufficient magnitude and 
significance, even though slight, to emphasize the need of caution in the 
interpretation of the MMPI profile. Meehl (7) has pointed out that 
scales such as the MMPI acquire whatever non-statistical meaning they 
possess from the clinical description of those extreme deviates who make 
up the diagnostic categories for which the various scales are named. 
Hathaway and McKinley (3) have indicated that, while 70 is a borderline
score, useful interpretation of a score always depends upon the clinician’s
experience with a given group. If, for example, a deviate score of 70+
on the Ma scale of the MMPI can occur in one mainly non-pathological
population with a frequency four times as great as in another mainly
non-pathological population, it would seem that, without clinical evidence,
a score of 70+ does not have the same significance in the two populations.

If a presumably non-pathological population has a mean profile
on the MMPI significantly depressed or elevated, with respect to the
original group on which present norms are based, clinical experience with


differences noted in this study are the result of cultural and environmental
factors, rather than the result of a larger incidence of abnormal
amounts of the personality traits being measured, it would seem imperative
that personnel workers secure much more information than they now
possess about the scores on the MMPI of the population with which they
are dealing, if they are to make valid interpretations of an individual
profile in this population.
The first step in this direction would seem to be a testing program
which will furnish sufficient data to enable the personnel worker to determine
whether or not the population with which he is working is significantly
different, with respect to the scores on the personality test, from
the original population on which the norms for the test were established.


Received January 15, 1948. 


References 


1. Brown, H. S. An investigation of the validity of the Minnesota Multiphasic Personality
   Inventory for a college population, and the relationship of certain personality
   traits to achievement. Unpublished Ph.D. thesis, Univ. Minn., 1947.
2. Eckert, R. E. Outcomes of general education. Minneapolis: Univ. Minn. Press,
   1943.
3. Hathaway, S. R., and McKinley, J. C. Manual for the Minnesota Multiphasic
   Personality Inventory. New York: The Psychological Corp., 1943.
4. Loth, N. N. Correlations between the Guilford-Martin Inventory of Factors STDCR
   and the MMPI at the college level. Unpublished Master’s thesis, Univ. Minn.,
   1945.
5. Lough, O. M. Teachers college students and the Minnesota Multiphasic Personality
   Inventory. J. appl. Psych., 1946, 30, 241-247.
6. McKinley, J. C., and Hathaway, S. R. A multiphasic personality schedule: II.
   A differential study of hypochondriasis. J. Psychol., 1940, 10, 255-68.
7. Meehl, P. E. A general normality or control factor in personality testing. Psychol.
   Monogr., 1945, 59, No. 4.
8. Meehl, P. E., and Hathaway, S. R. The K-factor as a suppressor variable in the
   Minnesota Multiphasic Personality Inventory. J. appl. Psych., 1946, 30,
   525-564.
9. Torrens, J. K. An investigation and evaluation of the Guilford Inventory of Factors
   STDCR with special reference to the MMPI. Unpublished paper, Univ. Minn.,
1 


10. Williams, C. T. These we teach. Minneapolis: Univ. Minn. Press, 1943.


Television's Effects on Leisure-Time Activities 


Thomas E. Coffin 
Hofstra College, Hempstead, L. I.


Television is rapidly coming over the horizon as a major medium of
communication. Though still circumscribed in its coverage, it is apparently
due for relatively speedy expansion. Its advent has stimulated
a host of guesses as to its coming effects on our lives.¹

In the belief that television’s effects—whatever their eventual 
nature—deserve serious and continuing study, the psychology depart- 
ment of Hofstra College has set up a Television Research Program to 
conduct periodic studies of the new medium’s social and psychological 
effects. The Program’s studies, originating in April, 1948, when tele- 
vision began to attract its first real attention, will be able to follow the 
changing patterns of its influence as the new medium grows. 

To orient ourselves to the problem and sketch in its broad outlines 
we began with a series of a hundred “depth interviews” of television
families. These qualitative probings suggested that television may have
a pronounced impact on set-owning families: television tends to pull the
family together as a unit once more, preempts time and attention formerly 
given to hobbies, radio, movies and other leisure-time activities, and 
engenders an intensity of feeling which leads some to refer to their sets 
as “practically a member of the family.” 

These interesting findings we regarded as provocative hypotheses 
calling for more precise investigation. To put them to explicit test we 
designed a second series of interviews employing a matched-group 
technique to contrast the actual behavior of television with non-tele- 
vision families. The present report presents the findings of these 
matched-group interviews. 


Method


Selecting a specified period of time for study—the first week in May, 1948— 
we questioned paired television and non-television families as to how many 
times they had gone to the movies, how much they had read, listened to the 
radio and participated in other leisure-time activities during the sample week. 
A comparison of the frequencies with which these two groups of families, 
television and non-television, engaged in these activities permits a cautious
inference as to television’s influence on family activity patterns.


1 The gist of these guesses is aptly summed up in one magazine's estimate that 
television may “change the American way of life more than anything since the Model 
T.” Time, May 28, 1948, p. 72. 



The Matched Families. At this stage so little is yet known regarding the 
sampling characteristics of television owners² that it was impossible to set up
proper controls for an accurate sampling study. Rather, we attempted to 
match each television family as accurately as possible for area of residence and 
socio-economic status with a non-television family. 

The interviews were done on Long Island by students in the author’s class
in Consumer and Opinion Research, who had done several such surveys during
the year. To hold constant interviewer bias and error in judging economic
status they were done in pairs, one TV and one non-TV, by the same interviewer.
For their non-television interviews they selected, on the same block,
the house which most closely resembled the TV house in apparent socio-economic
status. Since we were not trying to sample specific areas, interviewers
tended to work in neighborhoods with which they were personally
familiar, which probably tended to increase the accuracy of their matching of
TV and control families.


Table 1
Composition of TV and Control Groups

                                        Non-TV      TV
Total number of families                  137       137
By socio-economic classes:
  “A” group                                          18
  “B” group                                          84
  “C” group                                          35
Total number of people                              518
Mean number per family                              3.79


The composition of the two groups is shown in Table 1. Two hundred and
seventy-four interviews were obtained, with 137 TV and 137 non-TV families,
representing a total of slightly more than a thousand persons. The match is
fairly close between the two groups, but the television group leans in the direction
of a slightly higher socio-economic status and slightly larger size of family.

Differential Effects on Different Groups? In contemplating the economic
impact of television two significant questions come up, having to do with
the differences in TV's effect as it strikes various groups in the population.
One has to do with its effects at various economic levels: are they the
same in the lower brackets as in the upper? The other concerns the influence
of habituation: do these effects diminish as the family gets used to having their
set around?

The number of our cases is too small to make detailed breakdowns very
significant; any findings must be interpreted with extreme caution. Subject
to this caveat, analyses of our data by economic level were made and the results
will be mentioned where they are of interest.


2 Cf. the disparities in the published figures on the economic levels of set owners.
Television Magazine, April, 1948.

3 Sometimes it is the block’s most well-to-do home which has TV; in these cases they
were allowed to go to adjacent blocks for non-TV interviews. Interviewing the house
next door was discouraged, because of the danger of more than ordinary intercourse and
mutual influence between such immediate neighbors.

4 Our TV group turned out to include a high proportion of upper middle class (“B”)
and no lower class (“D”) families, but we do not mean to imply that this necessarily
represents the economic distribution of television ownership in the population at large.


The second question, habituation, arises because some have suggested that
television's effects may be transitory: that there may be an initial
drop in other activities, but after the family has had the TV set a few months the
novelty will wear off and family members will go back to their previous habits.


We divided our TV group according to length of
ownership. We found that the group halved itself at the six-months point,
so we called those of less than six months’ duration “new owners,” and those
of six months or more “old owners.” Sixty-eight families are “new owners” and
69 are “old owners,” 30 of whom have owned sets for a year or more.

enough instances of long-term ownership to show up any marked 


THR 


television families can be grouped into two categories: “out-of-home”
entertainments such as movie going and sports attendance, and “at-home”
activities such as reading and radio listening. Among television
families the overall level of participation in activities outside the home 
was only about three-fourths that of non-television families; the specific 
figure, of course, varies with the activity in question. 

Motion Picture Attendance. The coming tug of war between television
and the movies has already attracted considerable attention in the
entertainment world. Varied opinions have been expressed as to the
extent of television’s encroachment upon movie attendance. Our own
data suggest that it may be moderate.

Fifty-nine per cent of our television families believe they attend 
movies less often since they got their television set. The figure is the 
same for both “new owners” (had set less than six months) and “old 
owners” (six months or more). By economic levels, the percentage is
highest (69%) among the middle-class (“C”) group. Thirteen per cent
of the total group also report that they enjoy movies less since having 
television in their homes. 

These answers are in terms of people’s recollection of the changes in
their habits. What does their actual attendance for the sample week 
show, when compared with the non-television controls? Table 2 gives 
these figures. 

In the non-television group there were 61.6 attendances per hundred 
persons (one “attendance” = one person one time). In the television 
group there were 49.2 attendances per hundred, twenty per cent fewer
than in the control group.
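
The percentage differences quoted for out-of-home activities follow from the attendance rates per hundred persons. A minimal sketch of the arithmetic, using the rates reported in Table 2 (the function name and the row labels in the dictionary are illustrative):

```python
def percent_decline(control_rate, tv_rate):
    """Percentage by which the TV group's rate falls below the control rate."""
    return 100 * (control_rate - tv_rate) / control_rate


# Attendances per hundred persons: (non-TV families, TV families).
rates = {
    "Movies": (61.6, 49.2),
    "Other entertainments": (63.0, 45.0),
    "Total": (124.6, 94.2),
}

for activity, (control, tv) in rates.items():
    print(f"{activity}: {percent_decline(control, tv):.0f}% lower")
```

The three results round to 20, 29 and 24 per cent, matching the figures given in the text for movies, other entertainments, and all out-of-home activities combined.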

At all socio-economic levels the rate of attendance was lower among
the television families than among the corresponding control families,
though on the “A” level the difference is not statistically significant, due




Table 2
Participation in “Out-of-Home” Activities

                                  Non-TV       TV
                                  Families     Families
Type of Activity:               (Attendances per Hundred Persons)
Movies                             61.6        49.2
Other entertainments               63.0        45.0
  (Other commercial)              (54.4)      (38.3)
Total                             124.6        94.2


to the small number of cases here. The difference was greatest (33%) in
the middle-class (“C”) group.

The difference in movie attendance by adults (28%) is more marked
than the difference in children's attendance, which is so slight (7%) as to
be statistically unreliable. There is no reliable difference in rate of
attendance between “new” and “old” owners.

Other Forms of Entertainment. Movie-going was the most frequent
single activity, with sufficiently high attendance rates to yield relatively
reliable figures. The figures on other entertainments are reliable only
when several activities are grouped together; we cannot single out specific
forms for comparison.

Grouping together all other out-of-home entertainments except
movies, we find the participation of television families to be 29% lower
than that of control families (Table 2). It is lower on all economic
levels, but again the difference is not significant on the “A” level. The
decline is relatively similar for adults and children. Old owners' level
of participation is slightly but not significantly lower than that of new
owners.

Commercial types of entertainment (involving paid admission) seem
to suffer slightly more than non-commercial types (parties, socials, etc.).
The greatest decline appears to be in such activities as dining, dancing,
night-clubbing. The smallest difference between TV and control
families was in their attendance at sports events (baseball, the fights,
the races) but the figures here are too scanty to be reliable.

Combining movies and other forms of entertainment we find that the
total participation in out-of-home activities is about 24% less for television
than for non-television families. The decline seems to be greatest
on the middle-class (“C”) level and it is as evident for old owners as it is
for new owners.




II. “At-Home” Activities 


Not only entertainments outside the home but also leisure-time activi- 
ties carried on at home suffer under the influence of television. Perhaps 
the two most important of these are radio listening and reading, compet- 
ing with television as media of communication and advertising. 

Radio Listening. The television group averages slightly more radio 
sets per home than the control group (medians are 3.9 and 3.4, respec- 
tively) suggesting that before television they may have been even more 
active radio listeners. We have already seen that they spend less time 
out of the home, so that apart from television itself they seem to have 
equality of opportunity for listening to the radio, in point of time and 
sets available. 


Table 3
“At-Home” Activities

                                  Non-TV      TV          Per Cent Decline
Activity:                         Families    Families    in TV Group
Hours of radio listening:
  Day (before 6 p.m.)               3.5         2.6          26%
  Night (after 6 p.m.)              3.4         1.1          68%
Percent of family listening:
  Day                                           38%
  Night                                         51%          31%
Hours of reading                                17.5         18%

As Table 3 shows, however, television owners use their radios 26% 
less during the day and 68% less at night than do non-television families. 
The greater decline in nighttime listening, of course, reflects the fact 
that the bulk of television broadcasting is still in the evening hours. 

The greatest difference between television and control families ap- 
pears on the middle-class level, where non-television homes average 
nearly four hours per night as against one-half hour for television homes. 
There are no significant differences between new and old owners. 

The drop is not only in the amount of listening but also in the number 
of people customarily listening when the radio is on. During the day- 
time there is virtually no difference, but at night three-fourths of the 
members of non-television families usually listen to the radio, where only 
half the members of television families listen. 

Reading. Reports on the number of hours spent in reading during the
preceding week are perhaps less reliable than reports on other activities,
due to greater difficulties in memory and greater temptation to falsify for



of social prestige. These cautions should be remembered in


The effect of television upon reading habits will be especially interesting
to follow in its development, for both are essentially visual media and
both (unlike radio) require relatively undivided attention. Thus in a
sense these two might be thought of as being in more direct competition
than even television and radio.

From a number of comments made in the preceding depth interviews,
we had gained the impression that reading might suffer considerably in
homes with television. Our present data do not support that impression.
Compared to radio listening and many other entertainments reading is
holding its own relatively well in our TV families. Table 3 shows an
overall decline of only 18% for hours spent in reading during the sample
week.

There are no consistent trends in reading changes according to
economic levels. In both TV and control groups, middle-bracket people
reading than upper class families, but at all levels TV families 
less than the corresponding controls. 
Television did not bring any change in type of reading between the two
groups. For both, most of the reading time is given to newspapers and
to books. By economic levels, the “A” families read more books
and the “C” families more newspapers, as would be expected. This was
true in both TV and control groups. There was little difference between
new and old television owners in either amount or type of reading. New
owners did slightly more (18 hours per family) and old owners slightly less
reading (17 hours).


III. Attitudes Toward Television 


Television owners make relatively extensive use of their sets. Their
attitudes toward television in general are enthusiastic and they are favorably
inclined even toward the advertising which appears on this medium,
finding it preferable to radio's commercials.

Amount of Television Viewing. In relation to the still limited number
of hours of television broadcasting available during the week, television
families use this medium extensively. They reported an average of
hours of viewing during the week. Set usage was slightly higher in 
the upper economic brackets, and averaged 2.5 hours more for new than
for old owners.

During these hours that the TV sets are in use relatively large numbers
of people are watching them. There was an average of 3.56 viewers
per set, as compared with 1.9 night-time radio listeners in these same

During the week of this survey, stations in the area were averaging about 25 to 30
hours of broadcasting per week.




families. When family size is held constant, there are proportionately 
more viewers per set on the middle-class level than on the upper levels.

Opinions of Television in General. By and large television owners are 
enthusiastic about the medium and are happy to talk about it. Rapport 
was unusually easy to establish with these respondents and they were 
remarkably interested in our survey. 

We offered them a simple five-point rating scale, with adjectives 
ranging from “wonderful” to “disappointing,” to express their opinion 
of television in general. The results, in terms of the percentage of 
respondents selecting each category, are as follows: “Wonderful”: 55%;
“Good”: 37%; “Fair”: 6%; “Poor”: 1%; and “Disappointing”: 1%.

Better than half of them described television as “wonderful”; less than 
ten per cent rated it anything less than “good.” The distribution of 
answers for old owners is very similar to that for new owners. 

Asked for a fuller description of their attitude, owners gave a variety 
of comments. Perhaps the most common one involved the thought that 
television is “the closest thing to actually attending the broadcasted 
event,” that “without leaving your living room the world is practically 
before your eyes.” Many tempered their endorsement with the observation
that there is “still plenty of room for improvement”; they seem
to have faith that this will come with time. 

Comparison of Radio and Television Commercials. These predomi- 
nantly favorable attitudes seem to carry over also to the advertising which 
appears on television. In the depth interviews many spontaneous favor- 
able comments were made about TV commercials. Some of the com- 
ments led to our including a query as to which the respondents liked 
better, the advertisements on television or those on radio. 

This is a question on which owners’ opinions are rather definite; only 
3% said “neither” or “don’t know.” Six per cent preferred radio and 
91% preferred television commercials. Preference for TV commercials 
increases as we go down the socio-economic scale, but is less pronounced
among old owners than new owners. 

Reasons frequently mentioned for preferring TV commercials were
that they are “more vivid,” “more bearable” and that “not so much is
left up to your imagination.” Some who preferred radio gave it a back- 
handed compliment: “radio ads are easier to ignore,” “it takes no effort 
not to listen.”

“Psychological Duration” of TV vs. Radio Commercials. Taking
advantage of the functional relation between interest and estimation of
temporal duration, early radio research used the device of asking how long
advertisements lasted as an indirect index of listeners’ interest in the
commercials. We adapted the same device to provide a comparative
index of interest in television versus radio commercials.




For radio and television separately, we asked our TV respondents
how many minutes out of every quarter hour they thought were spent on
advertising. Comparing the two estimates, we find that 74% of the
respondents felt that the number of minutes per quarter hour on advertising
was greater for radio than for television; 10% thought the number
was greater for television and 16% thought they were equal (Table 4).
It is interesting to note that as we go down the socio-economic scale
there is a steady increase in the percentage of answers favorable to television.
On the “A” level 61% and on the “C” level 83% feel that
radio advertising is more time-consuming. Perhaps this is in part a 
residue of the differences in the radio programs the different classes 
customarily listen to and in part a reflection of the generally greater 
regard in which the lower groups seem to hold television. 

Estimates by new owners are more favorable to TV than those of old 
owners. This is consistent with our previous findings on commercials. 
In reactions to advertising we have an area where the “novelty effect” 
does seem to be in evidence. 

Though broadcasters’ current time limits on advertising are roughly 
similar for the two media, our results may reflect some actual difference 
in the duration of radio and TV commercials. But certainly they 
picture, too, the shorter “psychological duration” of television commer- 
cials, and hence reflect their greater interest to the audience. In its 
effect on people's attitudes, the amount of time actually spent on advertising
is less vital than the amount people feel is given to it.


Table 4
Apparent Duration of TV and Radio Commercials

                                                          Radio    Television
Spends more minutes on advertising:                        74%        10%
Mean estimated minutes per quarter hour on advertising:    3.98       2.41


What are their impressions in actual number of minutes per quarter 
hour? Table 4 shows that they believe radio spends four minutes and 
TV two and one-half minutes out of every fifteen on advertising.* This
means that radio commercials seem to occupy 65% more time than do
TV commercials, a strong hint of the audience’s greater interest in
television's presentations.
* We checked this estimate for radio against the control group's answers to the same
question. The respective means of 3.98 and 4.09 minutes for radio are very similar and
indicate that there is no constant bias affecting the TV group's estimates of time intervals.
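
The 65% figure follows directly from the two means in Table 4. As a minimal sketch of that arithmetic (the function name `percent_more` is ours and does not come from the article), the comparison can be checked as follows:

```python
def percent_more(a: float, b: float) -> float:
    """Return how much larger a is than b, expressed as a percentage of b."""
    return (a / b - 1.0) * 100.0


# Mean estimated minutes of advertising per quarter hour (Table 4).
radio_minutes = 3.98
tv_minutes = 2.41

excess = percent_more(radio_minutes, tv_minutes)
print(f"Radio commercials seem to occupy {excess:.0f}% more time than TV commercials")
```

The same ratio, applied to the means within each socio-economic bracket, would give the bracket figures quoted in the text; those bracket means are not reported, so they are not reproduced here.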


558 Thomas E. Coffin


Again we find that the difference between the two media is felt more 
acutely as we move down the socio-economic scale. Owners in the upper 
bracket judge that radio spends 50% more time on advertising while
middle-class owners feel it spends 88% more time. New owners, too, 
are considerably more impressed with TV’s relative brevity than are old 
owners. 

Summary and Conclusions 


1. The results of the present study suggest that television may bring 
about appreciable changes in the family’s pattern of leisure-time activi- 
ties. Comparing the activities of matched groups of 137 television- 
owning families and 137 non-owning families during a sample week in 
May, 1948, we found that television families showed a considerably lower 
level of participation in most other types of activity. 

2. Television families engage in fewer activities outside the home 
than do comparable control families; their general level of participation 
was about three-fourths of that shown by the controls. Inside the home 
there is also a shift in the proportion of time devoted to other activ- 
ities, with night-time radio listening declining most and reading least 
among the television families. 

3. Television owners are enthusiastic about this medium. They use 
their sets extensively and have a high opinion of television in general. 
Even the advertising is much preferred to the commercials on radio. 

4. The fact that these attitudes and these differences in habit patterns 
are for the most part about as evident among owners of many months’ 
standing as they are among those whose sets are still a novelty suggests 
that television’s influence may not be a transitory phenomenon, passing 
when owners become habituated to their sets. 

5. Analysis of our data by socio-economic status suggested that there 
may be a tendency for television's influence to be felt more strongly,
perhaps, among the middle-class families of our group than among those 
higher in the socio-economic scale. However, the number of our cases 
is too small for us to draw any more than tentative hypotheses from the 
results of breakdowns by either length of ownership or economic status. 

6. Television as a medium of communication, entertainment and 
advertising appears to exert an appreciable influence in the lives of set- 
owning families. On a nationwide scale this influence is probably neg- 
ligible as yet, but our evidence suggests that as the medium becomes 
accessible to increasing numbers in the population it may bring with it 

noticeable effects on the family’s activities in and out of the home.
Received August 10, 1948. 
Early publication. 


Identification of Cola Beverages: II. A Further Study


J. W. Bowles, Jr. and N. H. Pronko 
University of Wichita 


In an earlier study, the present investigators gave four different Cola
beverages (Coca Cola, Pepsi Cola, RC Cola and Vess Cola) to 108 Ss
to identify. Results showed an almost total absence of Vess Cola
identifications. Instead of responding with the fourth brand name, Ss tended
to repeat the name of one of the other three beverages listed. These
results were interpreted as indicating lack of a gustatory basis for the
Ss' identifications. It was suggested that these responses were a function
of a ready labelling of the series of Cola beverages with a stock of naming
reactions that seemed to be related to thoroughness of advertising and
other forms of culturalization.
Further confirmation of the correctness of such an explanation came
from the results of administering four samples of the same Cola beverage
respectively to each of four groups of 15 Ss. The picture was not
essentially different from that obtained with the 108 Ss. As a result, the
hypothesis was developed that if only three beverages were used, the
identifications would be distributed in an order approximating chance.
The present experiment was designed as a test of the above hypothesis.


Procedure 


The subjects of the present study consisted of two groups: 96 Ss in
Part I and 60 in Part II. These were beginning students in Elementary
Psychology courses.

Part I. Each of 96 Ss was admitted individually into the experimental
room and was invited to sit down. The following instructions
were then read to him.

“We would like to have you taste and identify some Cola drinks. You will
be told in what order and when you are to drink them. After you have finished
each sample, report your identification to E and take enough water from the
cup before you to rinse your mouth well.”

A tray containing three one-oz. glasses of Coca Cola, Pepsi Cola and
RC Cola respectively was placed before the S. He was then told to
drink the beverages labelled x, y, and z in the order indicated to him.


Pronko, N. H., and Bowles, J. W., Jr. Identification of Cola beverages: I. First
study. J. appl. Psychol., 1948, 32, 304-312.




Samplings were spaced about a minute apart, S's name and other
information being recorded in the interval between drinks.

The order of presentation of the three beverages, determined
pre-experimentally, was such that each of the three stimuli appeared in the
first, second and third position 32 times. This counterbalanced order
was used to preclude the operation of position effects or of oral
interactions among the stimuli. All beverages were kept out of sight of Ss
and were placed in a refrigerant maintained at approximately 5°C.
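
The article does not say how the counterbalanced orders were drawn up. One way to obtain the stated property (each beverage in each serial position exactly 32 times for 96 Ss) is to cycle through all six possible orders, 16 Ss per order. The sketch below assumes that scheme; the function names are illustrative only.

```python
from collections import Counter
from itertools import permutations

BEVERAGES = ("Coca Cola", "Pepsi Cola", "RC Cola")


def counterbalanced_orders(n_subjects: int) -> list:
    """Assign presentation orders by cycling through all 3! = 6 permutations."""
    orders = list(permutations(BEVERAGES))
    if n_subjects % len(orders) != 0:
        raise ValueError("n_subjects must be a multiple of 6 for full balance")
    return [orders[i % len(orders)] for i in range(n_subjects)]


def position_counts(schedule) -> Counter:
    """Count how often each beverage occupies each serial position (1, 2, 3)."""
    return Counter(
        (beverage, position)
        for order in schedule
        for position, beverage in enumerate(order, start=1)
    )


schedule = counterbalanced_orders(96)
counts = position_counts(schedule)
for beverage in BEVERAGES:
    # One count per serial position for this beverage.
    print(beverage, [counts[(beverage, pos)] for pos in (1, 2, 3)])
```

Using every permutation equally often also balances which beverage precedes which, a stronger condition than the position balance the article reports.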

Part II. In Part II, 60 Ss were administered the same Cola drink at 
each of three trials. Thus, 20 got all Coca Cola; 20, all Pepsi Cola; and
20, RC Cola. In all other respects, the procedure was the same as that 
of Part I. 

Results and Discussion 


Inspection of Table 1 shows that, as in the previous study which 
utilized four different Colas, the three most common identifications are 
apparently related to the three most frequently advertised Colas with a
sprinkling of such unexpected beverages as Root Beer, Dr. Pepper, Nehi, 
and Red Rock. 

Table 1
Showing the Distribution of 288 Identification Responses When Each of the 96 Ss Was
Presented in Turn, but in Counterbalanced Order, with a 1 oz. Sample of Coca
Cola, Pepsi Cola, and RC Cola

              Frequency of Ss’ Various Identification Responses
Brand
Given S       C.C.    Pep.    R.C.    Other and D.K.    Totals
Coca Cola      39      26      22            9             96
Pepsi Cola     35      36      20            5             96
RC Cola        15      34      34           13             96
Totals         89      96      76           27            288

“Other and D.K.” combines the columns Dr. Pep., Cleo, Fount. Coke, Root Beer,
Red Rock, Nehi and D.K., whose individual cell entries are not legible in this copy.


Coca Cola is properly identified 39 times but is misidentified as Pepsi 
Cola 26 times and as RC Cola 22 times while Pepsi Cola is correctly identi- 
fied 36 times but is also misidentified as Coca Cola 35 times and as RC 
20 times. RC Cola is correctly named 34 times but is misidentified as 
Pepsi Cola exactly as often and as Coca Cola 15 times. Perhaps the low 
frequency of misidentifications as Coca Cola is due to the higher frequency 
of misidentification with other beverages. 

From Table 2 of Part II (where each of 20 Ss was given three samples
of the same Cola) it will be noted that results are not much different.
Coca Cola is identified as Coca Cola 27 times but is misidentified as
Pepsi Cola 20 times and as RC nine times. However, when Pepsi Cola




Table 2 


Showing the Distribution of 180 Identification Responses When Each of the 60 Ss Was 
Presented with Three 1 oz. Glasses of Either Coca Cola, Pepsi Cola, or RC Cola 


Frequency of Ss’ Various Identification Responses 


Brand
Given S C.C. Pep. R.C. 7Up Dr.Pep. Vess D.K. Totals
Coca Cola 27 20 9 1 1 2 60 
Pepsi Cola 22 19 17 2 60 
RC Cola 27 15 17 1 60 
Totals 76 54 43 1 3 1 2 180 


is given three times in succession, it is said to be Pepsi Cola 19 times, Coca 
Cola 22 times and RC 17 times. As regards RC Cola, it is correctly
identified as RC only 17 times but wrongly identified as Pepsi Cola 15
times and as Coca Cola 27 times! In every instance, regardless of the
stimulus used, Coca Cola is the response of greatest frequency. It is
conjectured that these results may reflect the relative effectiveness or
extent of the advertising employed by the three main Cola competitors. 
Table 3 shows the percentage of correct responses when Ss were given 
three different Colas. Note that for Coca Cola this percentage is 41 as 
compared with 38% for Pepsi Cola and 35% for RC Cola. It is
suggested that the slight differences among the three categories of correct
identifications are a function of a relatively greater frequency of certain
naming responses. Apparently this interpretation is valid because an
examination of Table 4, which shows classification of identification
responses when the three samples consisted of the same Cola for each S,
indicates a similar trend. Although Coca Cola is given to the Ss each
of three times, it is correctly identified 45% of the time but is misidentified
55% of the time, this despite the fact that Coca Cola naming responses
constituted 76 of the total 180 responses. Although the Coca Cola
response is given over and over, nevertheless it does not yield a high


Table 3
Identification of Cola Beverages by 96 Ss When Each S Was Presented a
Sample of Each of Three Brands

                          Brands of Cola Presented
                 Coca Cola    Pepsi Cola    RC Cola      Totals
Identification   No.  Pct.    No.  Pct.    No.  Pct.    No.  Pct.
Correct           39   41      36   38      34   35     109   38
Incorrect         57   59      60   62      62   65     179   62
Totals            96  100      96  100      96  100     288  100




batting average. As regards Pepsi Cola, it is correctly identified only 
32% of the time and is misidentified over twice as often (68%)! 

Results for RC Cola are even more striking. This beverage is 
misidentified 72% of the time. The low percentage of correct identifica- 
tion (28%) is, perhaps, a function of the greater frequency of occurrence 
of the Coca Cola response. Ss could not get in as many RC Cola namings


Table 4
Identification of Cola Beverages by 60 Ss When Each S Was Presented
Three Samples of the Same Brand

                          Brands of Cola Presented
                 Coca Cola    Pepsi Cola    RC Cola      Totals
Identification   No.  Pct.    No.  Pct.    No.  Pct.    No.  Pct.
Correct           27   45      19   32      17   28      63   35
Incorrect         33   55      41   68      43   72     117   65
Totals            60  100      60  100      60  100     180  100


because they had exhausted this opportunity by giving the “Coke”
response too often. The overall picture shown in Table 4 is also important.
The total number of correct identifications, 63 out of 180, gives a value
of 35%, which means that 65% of the responses were misidentifications.
These results are in line with the expected 33 1/3% of correct namings,
which might occur “by chance.”
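
How close 35% is to the chance figure of 33 1/3% can be gauged with a one-sample critical ratio. The article does not give the formula it used, so the sketch below is an assumption: it uses the standard error of a proportion under the chance hypothesis, and it treats the 180 responses as independent although each S contributed three of them.

```python
import math


def critical_ratio_vs_chance(correct: int, total: int, chance: float) -> float:
    """Critical ratio of an observed proportion against a chance proportion."""
    observed = correct / total
    # Standard error of a proportion when the chance hypothesis is true.
    standard_error = math.sqrt(chance * (1.0 - chance) / total)
    return (observed - chance) / standard_error


# 63 correct identifications out of 180 responses (Table 4), chance = 1/3.
cr = critical_ratio_vs_chance(correct=63, total=180, chance=1 / 3)
print(f"observed = {63 / 180:.1%}, critical ratio = {cr:.2f}")
```

Under these assumptions the ratio comes out well below 1, in keeping with the article's reading that correct namings occur about as often as chance would predict.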

In the previous study, when four different Cola beverages were 
employed, results suggested that the pattern of naming responses was a


Table 5

Critical Ratio Tests of the Hypothesis That the Distribution of the Various Identification
Responses to the Three Cola Beverages Are Not on the Basis of Actual Taste Stimuli

                                        How Identified
                 As Coca Cola             As Pepsi Cola            As RC Cola
Beverage                   Critical                 Critical                 Critical
Used         Diff. σ diff.  Ratio     Diff. σ diff.  Ratio     Diff. σ diff.  Ratio
Coca Cola    .105   .071    1.478     .062   .064     .968     .043   .073     .589
Pepsi Cola   .060   .070     .942     .042   .067     .626     .070   .072     .972
RC Cola      .164   .130    1.184     .021   .067     .313     .114   .077    1.480


function of the Ss’ familiarity with Cola brand names. If that hypothesis
is correct, then in this study with use of three brands of Cola, we
should expect on a statistical basis to get a chance distribution of Cola
names regardless of beverage employed. Actually, Table 5 proves our
hypothesis. The correct identifications of the three respective Colas do
not differ significantly from chance expectancy since it will be observed
that no critical ratio approaches 2.0 and only three are above 1.0. In
other words, in applying names to identify the three Colas our Ss might

Table 6

Critical Ratio Tests of the Hypothesis That the Distribution of the Various Identification
Responses to the Three Cola Beverages Are Not on the Basis of Actual Taste Stimuli

                                        How Identified
                 As Coca Cola             As Pepsi Cola            As RC Cola
Beverage                   Critical                 Critical                 Critical
Used         Diff. σ diff.  Ratio     Diff. σ diff.  Ratio     Diff. σ diff.  Ratio
Coca Cola    .022   .076     .280     .037   .059     .627     .124   .092    1.340
Pepsi Cola   .044   .077     .571     .019   .089     .213     .062   .101     .613
RC Cola      .022   .076     .280     .055   .086     .639     .062   .101     .613


just as well have drawn such names from a hat. Comparison of Table 5

with Table 6, which latter shows results of Part II where each of the three 

stimuli given Ss were the same, indicates similar results. Critical ratios 

for percentage of correct responses again do not show a difference from 

chance expectancy. With one exception (a CR of 1.3), all CRs are below 
Table 7

Critical Ratio Tests to Determine Whether Differences Between Percentages
in Results of Part I and Part II Are Significant

                                           Brands of Cola Presented
Statistic                                 Coca Cola  Pepsi Cola  RC Cola  Totals
P1 (% correctly identified in Part I)        41%        38%        35%     38%
P2 (% correctly identified in Part II)       45%        32%        28%     35%
P1 - P2                                       4%         6%         7%      3%
σ diff.                                     .081       .078       .076    .046
Critical Ratios                             .494       .769       .921    .652


As a final test of our hypothesis, we present the data of Table 7.
Here are compared the correct responses in Part I (three different Cola
samples) and Part II (three samples of the same Cola). The differences
in correct naming responses are not statistically significant as evidenced
by the extremely low significance ratios. For the Coca Cola, Pepsi Cola




and RC Cola categories the CRs are respectively .49, .77 and .92, indicat- 
ing that the pattern of naming is essentially the same regardless of 
presentation of (a) three different samples of Cola or (b) three samples of 
the same beverage. 
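
The comparison of Part I with Part II can be sketched as a critical ratio for the difference between two independent proportions. The formula is not stated in the article; the version below assumes the unpooled standard error, sqrt(p1 q1 / n1 + p2 q2 / n2), and takes its counts from Tables 3 and 4. The function and variable names are illustrative.

```python
import math


def critical_ratio(correct_1: int, n_1: int, correct_2: int, n_2: int):
    """Return (critical ratio, sigma of the difference) for two proportions."""
    p1 = correct_1 / n_1
    p2 = correct_2 / n_2
    # Unpooled standard error of the difference between two proportions.
    sigma_diff = math.sqrt(p1 * (1 - p1) / n_1 + p2 * (1 - p2) / n_2)
    return abs(p1 - p2) / sigma_diff, sigma_diff


# (correct in Part I, n in Part I, correct in Part II, n in Part II)
DATA = {
    "Coca Cola": (39, 96, 27, 60),
    "Pepsi Cola": (36, 96, 19, 60),
    "RC Cola": (34, 96, 17, 60),
    "Totals": (109, 288, 63, 180),
}

for brand, counts in DATA.items():
    cr, sigma = critical_ratio(*counts)
    print(f"{brand:<10s}  sigma_diff = {sigma:.3f}  critical ratio = {cr:.2f}")
```

Under this assumption the standard errors agree with those printed in Table 7 (.081, .078, .076, .046). The ratios come out at roughly .54, .75, .93 and .62, slightly different from the printed values, which appear to have been computed from the rounded percentages. Either way, none approaches 2.0.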


Summary and Conclusions 


A group of 156 Ss was asked to identify one-oz. samples of the follow- 
ing three Cola beverages: Coca Cola, Pepsi Cola and Royal Crown (RC) 
Cola. In Part I, 96 Ss were presented one of each of three different 
Colas and in Part II, 60 Ss were given three samples of the same beverage, 
being evenly divided among the three different classes. 

In general, results show that whether Ss are given three different 
beverages or the same beverage three different times, the identifications 
are not essentially different in the two cases. All critical ratios are 
extremely low and lack statistical significance. Within the limits of the 
present experiment, the findings permit the generalization that when 
subjects are asked to discriminate and identify Cola drinks, they might 
do just as well by drawing the names of those beverages out of a hat. 


Received February 6, 1948. 




Book Reviews 


Tiffin, Joseph. Industrial psychology. New York: Prentice-Hall, Inc.
2nd edition, 1947. Pp. xxi and 553. $5.35.

This book is a revision of Dr. Tiffin’s earlier edition. Two new
chapters have been added, one on interviewing and related employment
methods and one on wages and job evaluation. The other chapters have
been revised to include some recent studies. The general organization
and method of presentation are unchanged.

Chapter One deals with individual differences and is followed by the
new chapter on employment methods. This chapter covers the topics of
interviewing, job analysis and specifications, statistical analysis of
personnel data, statistical and clinical use of application blanks, and
similar phases of the employment procedure.

The next five chapters deal with the general principles and specific
applications of employment tests. The discussions of the tests are brief
but cover a representative sample of tests. These chapters are well done
and cover the area adequately. There is a good emphasis on tests for
placing as well as selecting new employees. Validity data in terms of
costs, a very desirable criterion that is not often used, are presented.
The concept of the selection ratio is discussed in a very readable manner.
Chapter Nine is a case study of psychological contributions in testing
and training brought to bear upon a specific job, and it synthesizes these
procedures excellently. The usual topics of rating scales, safety, training,
efficiency and morale are considered in the other chapters that
comprise the book.

Although the entire presentation is very readable and generally
satisfactory, there are quite a few questionable details scattered throughout
that weaken the book. For example, some correlation coefficients
are given unusual interpretations. One r of −.07 is said to be significant
while others ranging as high as .35 are held insignificant (pp. 305, 306, ...).

In other places the statistical significance of results is questionable,
but there is no discussion of tests of significance. It would probably have
been desirable to include this concept, particularly since statistical
techniques as advanced as factor analysis are discussed. To give one example,
data are given comparing successful and unsuccessful laundry “pressers.”
The overlapping of the groups, the critical ratios of the obtained
differences, and even the numbers of cases are not mentioned (p. 34).


The method of drawing graphs is another detail in which a different 
manner of presentation would have been more satisfying, at least to this
reviewer. In many instances inter-relationships are shown by drawing
trend lines that connect the midpoints of class intervals. Scattergrams
would have been more meaningful. This may be shown by reference to
one graph showing the relation of job performance to test scores for a
group of “quillers” (p. 141). With successively higher minimum ac- 
ceptable test scores, the average production index is also higher. To get 
successively higher minimum acceptable test scores, it is apparently 
necessary to drop off some cases from the low end of the distribution of 
test scores. Thus, each succeeding class interval has fewer cases than the 
next lower class interval. There are only 28 quillers involved in this 
study. Each class interval above the first one must accordingly have 
some number of cases less than 28, but it is not possible to determine how 
many since more complete data are not given. This may mean that the 
high end of the trend line is based on so few cases as to be nearly spurious, 
although the graph as presented apparently shows a very high relation- 
ship of test scores to production. It is unfortunate that more informa- 
tion, such as in the nature of a scattergram, is not given. 

Other graphs show other kinds of weaknesses, some of which may be 
mentioned. One bar graph has no scale (p. 255). A learning curve is 
labeled “slow gains" at the place where it is most accelerated (p. 252). 
A graph shows the effect of occupational eyewear in one situation (p. 233). 
Six employees provided with glasses eventually produced more than did 
a “control group” without glasses. But the six had a more rapidly 
accelerating output curve than did the controls, even before the glasses 
were introduced. Thus, the value of the controls used in this study is 
questionable. The six are said to have reached a higher peak of produc- 
tion than did the controls, but this latter phenomenon is not shown in the 
figure. Many other graphs violate the rules laid down by a Committee 
of the American Statistical Association by failing to begin the vertical 
and horizontal scales with zero. The effect is to exaggerate relationships
for the unwary.

In two places the book may be criticized on a broader basis. In the
discussion of merit ratings, and again in the discussion of job evaluation,
the technique of factor analysis was applied. In each instance the end 
result was an abbreviated scale. It is highly questionable if these 
recommended scales are as desirable as might at first appear. 
Based largely upon a factor analysis, a merit rating system was
developed that consisted of two factors or items, job performance and super-
visory potential. It is questionable if this two item rating represents any 
advance over non-analytical techniques wherein a supervisor ranks his 




employees on their effectiveness, and ranks them in order of
promotability.

The inclusion of a rating for supervisory potentiality violates one of 
the widely accepted principles of merit rating. That is, Tiffin advocates 
rating potential as well as actual behavior, with no justification for this 
step. Nor does he question the ability of supervisors to rate the poten- 
tialities of their subordinates. 

In similar manner, a factor analysis of some job evaluation systems 
leads to the recommendation of a system involving only two or three 
factors (p. 390). The analytical values of evaluating jobs are lost if only 
two or three factors are to be considered. The primary purpose of a job 
evaluation program in any concern is to maintain harmonious working 
conditions or morale. It is not to establish a scientifically "true" basis
for paying wages. Rather, it is to establish an equitable and practical 
system and to have the employees realize that they are being paid accord- 
ing to an equitable and objective system. Tiffin apparently recognizes 
this, as shown by some of his discussions, but he sacrifices the human re- 
lations aspect of job evaluation when he advocates a system that is 
statistically sound but psychologically unsatisfying. Tiffin undoubtedly 
performs a service in showing that job evaluation systems need not be 
based on twenty or thirty factors, but he appears to have leaned over a 
little too far in recommending a very small number of factors. Perhaps 
about five factors are best, considering the psychology of the situation. 

In summary, this book is valuable as a text for a beginning course in
industrial psychology. But it should be supplemented by other sources 
and it will be most effective when analyzed carefully by the instructor. 
Harold F. Rothe 
Stevenson, Jordan & Harrison, Inc., 
Chicago, Illinois 


Brodman, K. Men at work. Chicago: Cloud, 1947. Pp. 191. $2.50.
In surprisingly few pages Dr. Brodman has presented an extremely
readable book about the “Supervisor and His People.” This book,
based on experiences in the Cornell Caterpillar Tractor Company
personnel program, is directed at front line supervisors in the hope that they
will be aided toward a better understanding of their workers and
themselves. The author presents his message largely in dialogue form
featuring a paragon of supervisors as he deals with various “types” of workers,
such as “complainers,” “shirkers,” “slow learners.” Joe, the
supervisor-narrator, is indeed a composite of many desirable supervisory
characteristics and virtues. Although this book will find its way into the libraries
of few professional psychologists, it should be required reading in any
supervisory training program.




The advantages of this volume can be simply stated: (1) it is easy to 
read and to understand (because of the careful choice of words and the 
use of the popular idiom, the dialogue presentation, and the summary and 
suggestions after each “problem”); (2) it contains tested psychological
principles and methods for employee counseling (such as being a good
listener, the limited use of advice, aiding catharsis, utilizing experts,
environmental manipulation); (3) it answers a definite need for a
supervisor’s handbook in human relations.

There are, however, certain objections to the book which this reviewer 
feels should be mentioned. First, there is a definite suggestion of
paternalism in the sub-title, “The Supervisor and His People” (italics mine),
as well as in Joe’s somewhat irritating omniscience. Second, the reader
who is a supervisor may get the false impression that simply by following
Joe’s example he can be successful in keeping his department happy and
efficient. Only to a certain degree is this true. Actually, a greater
amount of space could have been devoted to a realistic appraisal of what 
percentage of “successes” a supervisor might reasonably expect. 

These reservations, however, do not detract seriously from the merits 
of Men at work. Dr. Brodman has done a commendable service for the 
advancement of human relations in industry by writing a book for front 
line management, the supervisors.


William A. McClelland
Brown University 


Shartle, Carroll L., (with the assistance of Sanford Cohen). Vocational 
counseling and placement in the community in relation to labor mobility, 
tenure, and other factors. New York: Social Science Research Council, 
1948. Pamphlet 5. 

This memorandum prepared for the Committee on Labor Market 
Research of SSRC is concerned with “the effectiveness of vocational 
counseling and placement in the community and particularly with the 
role of employment services and their impact on the distribution of 
labor." Its purpose is to explore the directions in which research might 
be planned. 

This is an important publication because it directs attention to the 
need for evaluating the effectiveness of vocational counseling and
placement as a phase of research on labor market processes and behavior. It
is an important statement for psychologists because it points to an area
of research involving a joint relationship between applied psychology and
applied economics and sociology which has been largely overlooked.

In general, this reviewer found the specific questions, considered
apart from their topics, to be significant and suggestive of important 




inquiries. In propounding the topics, however, the writers are consider- 
ably less articulate. A confusing factor is the failure to differentiate 
between vocational counseling and placement and to allocate to each its
proper role in the entire vocational adjustment process. 

A second lack of clarity stems from the major emphasis upon the role 
of the public employment service. The treatment in this memorandum 
might better have been limited to those issues which are common to all 
programs of counseling and placement regardless of their sponsorship. 
This is particularly relevant when it is observed that the most extensive
program of vocational counseling now in existence is under the auspices 
of the Veterans Administration and that this is accompanied by what 
amounts to a program of placement relatively independent of the public 
employment service. Also, the role of public education looms large both 
at the present time and in the future in the provision of counseling
services and, to a lesser extent, placement. It may be anticipated that an
increasing amount of adult vocational counseling will be under the 
auspices of adult education programs. 

The usefulness, in terms of stimulation of research, of this memoran- 
dum would have been greatly enhanced if the authors had keyed in the 
bibliography to the questions so that the prospective researcher would 
have a more adequate guide to the work that has been done on specific
issues. Such an arrangement also might have brought to the attention
of the authors the fact that their bibliography, although admittedly
“selected,” omits what are perhaps the most significant researches and
accounts of operating programs in this general area. A major omission is
the volume Men, women, and jobs which summarized the extensive
research on individual diagnosis and training done by the Minnesota
Employment Stabilization Research Institute in the early thirties.
Although this research program set the stage for the later work of the
Occupational Research Program of the U. S. Employment Service, it is
not referred to; yet the researcher who fails to start his survey of the
literature with this account and with study of the monographs on which
it was based, will miss the classic in one important phase of this field.
Reference is lacking also to the highly significant studies of the E. S. R. I.
just prior to World War II when such titles as “Employment Prospects
of Men and Women Registrants in the United States Employment
Service in St. Paul,” “Mental Competence of Men and Women Registrants in
the United States Employment Service in St. Paul,” “Dominant Causes
of Unemployment, Duration of Unemployment, and Relief Status of
Unemployed Registrants in the United States Employment Service in
St. Paul,” and similar ones were issued. To the reviewer’s knowledge,
the only quantitative evaluations of employment agencies made to date 




are the two by Paterson and Kriedt reported in the Personnel Journal and 
in Occupations in 1947 but which are unmentioned in the present publica- 
tion. Certainly the counseling manuals put out by both the U. S. Em- 
ployment Service and the Veterans Administration also should be brought 
to the attention of the reader qualified in the field of economics but
lacking background in the field of counseling. The U. S. E. S. study in St.
Louis on counseling in the Employment Service also deserves mention. 
Others can be recalled. 

Undoubtedly this memorandum will serve to initiate students of 
labor market problems into the issues involved in evaluating programs of 
counseling and placement. 


Arthur H. Brayfield 
University of California, 
Berkeley, California 


Reports Submitted to the Civil Service Assembly by the Committee on 
Placement in the Public Service and the Committee on Probation in 
the Public Service. Placement and probation in the public service. 
Chicago: Civil Service Assembly, 1946. Pp. v-xvi; 201. 

The foreword to Placement and probation in the public service, by 
James Mitchell, Director of the Civil Service Assembly, closes with these 
words: “It should be made clear that these reports were not prepared 
with a view toward their official approval or formal adoption by the Civil 
Service Assembly, its Executive Council, or its Headquarters staff, and 
no action of this nature is contemplated. Indeed, such action is hardly 
required, for these forward-looking reports are the fruits of the joint efforts 
of competent chairmen and their able associates. As such, they can afford
to stand on their own authority in commanding the reader’s attention.” 

Even though psychology, as compared to public administration, has 
long been known facetiously as the science of verifying the obvious, Mr. 
Mitchell’s foreword, plus the absence of content in the book under review, 
should hasten the day when public administration will earn an equally 
descriptive aphorism: The art that obscures the obvious. 

The foreword, as quoted, is a disclaimer of responsibility under the
guise of policy. The text is almost completely wanting in substance.
When the editor of THE JOURNAL OF APPLIED PSYCHOLOGY
transmitted the volume, he suggested that a review might be worth 500
words; he did not mention, however, that the book is now two years old
and is most attractively bound.

Contributors to Parts I and II, placement and probation in the public
service, are well known. Many of them are personal friends of the
reviewer. But this fact does not deter him at this juncture from
expressing a blunt opinion. Nor does it keep him from offering his friends and
colleagues such sanctuary from their reluctant contributions as may be
found in a paraphrase of Marcus Antonius: “So are you all, all honorable
men.”

Placement and probation in the public service, as elsewhere, are still 
two unscaled peaks. And the present text adds little to the hope for an 
ascent. It does not recite the experiences of the able climbers who have 
attempted the heights; it is not a view by voyagers who, far inland, have 
sighted the sea. Rather, it is a glimpse downward of conventional 
apologists outdoing convention apologetically. It is a product of the
committee method, and a warning to those who live, and hope, and have 
their being in that method. 


Fred S. Beers
Federal Security Agency 


Leeper, Robert. Psychology of personality. Ann Arbor: Edwards 
Brothers, 1946. Pp. 167. $2.00. 


Were I teaching elementary general psychology, this little book would 
be my text for the first unit. It is interestingly written about things of 
vital concern to students, yet it does not evade difficult concepts nor 
seriously oversimplify them. The author’s general theory of personality 
is not obtruded or defended before the court of his professional peers— 
as a matter of fact it seems to be a rather eclectic theory of no great 
novelty—but the treatment has that systematic form which proceeds 
only from a basic theory. Inferior students will get from this book more 
than the usual number of ad hoc generalizations which they can use; and 
superior students will, I believe, deepen their understanding of human 
behavior.


Horace B. English 
The Ohio State University 


Missiuro, Włodzimierz. Znużenie. O fizjologicznych podstawach rac-
jonalizacji pracy. (Fatigue: Physiological bases for scientific man-
agement.) Warsaw: Książka, 1946. Pp. 255.

The author is a professor of applied physiology at Warsaw University
but he wrote the book in Scotland during the latter part of World War
II, a fact reflected in a large number of references to British investiga-
tions. The author acknowledged his intellectual debt also to the Har-
vard Fatigue Laboratory (now defunct). The third important source of
material for the book was the experimental studies carried out by the
author and his collaborators, grouped around the journal Przegląd
Fizjologji Ruchu (Review of Physiology of Activity) published up to 1939
in Warsaw.


Three types of fatigue are distinguished: fatigue produced by physical
activity, associated with mental effort, and related to emotional strain.
The author focussed his attention on fatigue resulting from muscular
work. Large space was devoted to biophysical and biochemical phenom-
ena of fatigue observed in isolated muscle. The changes in functional
characteristics of the intact organism, taking place in the course [...]

Furthermore, fatigue of the industrial worker was regarded as being
related not only to the work itself, its duration and intensity, and the
physical environment but also to the social and economic factors.

In occupational fatigue there is frequently a distinct lack of changes
measurable in terms of classical physiological and biochemical variables.
In addition to the fatigue of the effector system (muscle as contractile
structure) and of the cardio-respiratory and biochemical changes, at-
tention was paid to the fatigue of the apparatus of excitation and con-
duction; the study of the fatigue of the central nervous system is one of
the central tasks of industrial physiology.

In terms of the intensity of its manifestations and the speed of re-
covery, fatigue was classified as subacute, acute, and chronic. In acute
fatigue there is a reduction of the capacity to continue the activity or,
if the same level of performance is maintained, it can be done only at the
cost of an increased effort. Chronic fatigue is regarded as a pathologic
condition, reflected in diminished vitality and resistance to disease.

[...] length of the work day, placement and duration of rest-pauses are con-
sidered only briefly. The text is somewhat uneven in difficulty. In a
contemplated English edition the presentation of all elementary informa-
tion should be eliminated, pitching the monograph at a moderately
advanced level of physiological and biochemical training. As it is
practically impossible to discuss thoroughly in the same book both the
theoretical background and the problems of industrial fatigue, concen-
tration on the former would appear advisable. This would allow an in-
crease in the amount of documentary material. With these modifica-
tions an English edition would represent a welcome contribution to the
science of human work.


Josef Brozek
University of Minnesota 


New Books, Monographs, and Pamphlets 


Books, monographs, and pamphlets for listing and possible review should be sent to 
Donald G. Paterson, Editor, Department of Psychology, University 
of Minnesota, Minneapolis 14, Minnesota 


Management and the psychologist, Sec. II, Book 4, Paul S. Achilles;
Reading course in executive technique. Carl Heyel, Editor. New
York: Funk and Wagnalls Co., 1948. Pp. 64. $1.00.


Methods of psychology. T. G. Andrews, Editor. New York: John Wiley 
and Sons, Inc., 1948. Pp. 716. $5.00. 


Psychology for pastor and people. John S. Bonnell. New York: Harper
and Brothers, 1948. Pp. 225. $2.50.

American opinion on world affairs in the atomic age. Leonard S. Cottrell, 
Jr. and Sylvia Eberhart. Princeton: Princeton University Press, 
1948. Pp. 152. $2.50. 

An introduction to color. Ralph M. Evans. New York: John Wiley and
Sons, Inc., 1948. Pp. 340. $6.00.

Conference leader’s guide. Bulletin No. 15. Waldo E. Fisher. Pasa-
dena: California Institute of Technology, 1948. Pp. 28. $1.00.

Speech and voice correction. Emil Froeschels. New York: The Philo-
sophical Library, 1948. Pp. 321. $6.00.

Two-way street. Eric F. Goldman. Boston: Bellman Publishing Co.,
Inc., 1948. Pp. 23. $1.25.

Child care and guidance. Helen C. Goodspeed, Esther R. Mason, and 
Elizabeth L. Woods. Philadelphia: J. B. Lippincott Co., 1948. Pp. 
276. $2.40. 

Take up thy bed and walk. David Hinshaw. New York: Institute for 
the Crippled and Disabled, 1948. Pp. 262. $2.75. 

Under the ancestors’ shadow. Francis L. K. Hsu. New York: Columbia
University Press, 1948. Pp. 317. $3.75.

The nervous child. A symposium. Volume 7. Leo Kanner, Editor. 
New York: Child Care Publications, 1948. $5.00. 

The Kelley statistical tables. Revised edition. Truman L. Kelley. 
Cambridge: Harvard University Press, 1948. $5.00. 

The law of adoption in all 48 states. Morton L. Leavy. New York:
Oceana Publications, Inc., 1948. Pp. 76. $1.00.

Resolving social conflicts. Kurt Lewin. New York: Harper and Brothers,
1948. Pp. 230. $3.50.

Nomography. Alexander S. Levens. New York: John Wiley and Sons,
Inc., 1948. Pp. 176. $3.00.

The commonsense psychiatry of Dr. Adolf Meyer. Alfred Lief, Editor.
New York: McGraw-Hill Book Co., Inc., 1948. Pp. 675. $6.50.

A greater generation. Ernest M. Ligon. New York: The Macmillan
Co., 1948. Pp. 157. $2.50.

A laboratory manual in general experimental psychology. Norman L.
Munn. Boston: Houghton Mifflin Co., 1948. Pp. 224. $2.50.

The selection and use of diagnostic categories in clinical counseling. Harold
B. Pepinsky. California: Stanford University Press, 1948. Pp.
140. $2.00.

The psychiatric study of Jesus. Albert Schweitzer. Boston: The Beacon
Press, 1948. Pp. 81. $2.00.

Psychology of personality. Second edition. Ross Stagner. New York:
McGraw-Hill Book Co., Inc., 1948. Pp. 485. $5.00.

The psychology of abnormal behavior. Louis P. Thorpe and Barney Katz.
New York: The Ronald Press Co., 1948. Pp. 877. $6.00.

Training and selection of supervisory personnel in the I. G. Farbenwerke,
Ludwigshafen. Morris S. Viteles and Dewey L. Anderson. Wash-
ington, D. C.: U. S. Department of Commerce, 1947. Pp. 35. $1.00.

The adolescent child. W. D. Wall. London: Methuen and Co., Ltd.,
1948. Pp. 206. 8s. 6d.

The abnormal personality. Robert W. White. New York: The Ronald
Press Co., 1948. Pp. 613. $5.00.

Measuring and guiding individual growth. Ben D. Wood and Ralph
Haefner. New York: Silver Burdett, 1948. Pp. 535.

Contemporary schools of psychology. Revised edition. Robert S.
Woodworth. New York: The Ronald Press Co., 1948. Pp. 279.

Local labor market research. Dale Yoder, Donald G. Paterson, et al.
Minneapolis: University of Minnesota Press, 1948. Pp. 226. $3.

Collective bargaining in the office. Research Report No. 12. New York:
American Management Association, 1948. Pp. 120. $5.00.

Survey of personnel practices in unionized offices. Research Report No.
[...] New York: American Management Association, 1948. Pp. 88.


Journal of Applied Psychology 


Vol. 32, No. 6 December, 1948 


The Effectiveness of Intelligence Tests in the 
Selection of Workers 


Edwin E. Ghiselli and Clarence W. Brown 
University of California 


Of the various types of tests that have been used in the selection of 
workers, undoubtedly intelligence tests in their varied forms have been 
most frequently utilized. To some extent this assertion is borne out by 
the number of reports of the effectiveness of intelligence tests in the selec- 
tion of workers which appear in the professional literature. Indeed, the 
authors have been able to locate some 185 instances of reports of the 
effectiveness of these tests in the selection of employees in various occupa- 
tions. These instances include only those investigations where results 
were given in some statistical fashion indicating the degree of relationship
between the test scores and some index of proficiency on the job. If one
considers the additional number of investigations dealing with the rela- 
tionship between intelligence test scores and success in industrial training, 
labor turnover, and the like, it is apparent that this number is an under- 
estimate. 

To the above also must be added the unreported, but undoubtedly
numerous, instances where intelligence tests have been used for employ-
ment purposes without any verification whatsoever of their effectiveness.
One finds both in industry and in governmental agencies the indiscrimi-
nate use of intelligence tests for the selection of workers at nearly every
occupational level. The apparent reasoning in the blind use of these
tests is that while they may not prove to be effective in selecting better
workers for the specific job under investigation, at least they will do no
harm. Stated another way, in any employment situation the relation-
ship between intelligence test scores and job proficiency can be expected
to be either positive or zero but will not in any statistical sense be signifi-
cantly negative.

It is obvious that this notion needs empirical verification before it can
be accepted. If, for example, it were found that the relationship between
intelligence test scores and measures of job success were invariably either
positive or insignificantly different from zero, then the use of these tests
under such a point of view would receive some justification. On the 
other hand, if significant negative relationships were frequently found 
then the notion could be considered of doubtful application. It would, 
of course, be preferred that all indiscriminate use of tests cease, but this 
state of affairs, if ever attained, will require many years of education in 
scientific test principles. In the meantime we should learn what past 
applications will reveal concerning the value and limitations of any test. 


Basic Data and Methods


To examine the hypothesis that intelligence tests will give either posi- 
tive or zero predictions of occupational success, but will not give negative 
predictions, the authors surveyed the various professional journals and 
books for instances wherein intelligence test scores had been checked 
against job proficiency measures. In addition, they obtained instances 
from unpublished data collected by several industrial organizations. 

The correlations sought and employed in this analysis were between 
test scores and some index of proficiency on the job. The criteria against 
which the tests were validated included ratings of proficiency by super- 
visors, actual production figures, or some similar measure indicative of 
job proficiency. Relationships between test scores and success in train- 
ing, such as apprentice training, were not included. Similarly, valida- 
tions using as criteria the number of inservice promotions or raises were 
not considered. Job tenure as a criterion was accepted for inclusion only 
when the employees in the separated group were released because of 
failure to perform adequately on the job. 

In all, 185 validity coefficients were collected. These coefficients 
were segregated according to the occupational groups of the workers from 
which they were obtained. To test the statistical significance of the 
departure of these validity coefficients from zero, each one was transmuted 
into an equivalent Fisher's z' and then divided by the standard error of
a z’ of zero value for the same number of cases. The approximate formula 
for the standard error of z', the reciprocal of the square root of N-3, 
was used. 
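
The significance test just described can be sketched in a few lines of Python. This is a minimal illustration, not the authors' own computation: the function names are hypothetical, the two-tailed normal cut-offs of 1.960 and 2.576 for the 5% and 1% levels are an assumption (the text does not say which critical values were applied), and the coefficients in the example are invented.

```python
import math

# Two-tailed normal cut-offs; an assumption, since the source does not
# state which critical values were used.
CUTOFF_5_PERCENT = 1.960
CUTOFF_1_PERCENT = 2.576


def fisher_z(r: float) -> float:
    """Transform a correlation coefficient r into Fisher's z'."""
    return math.atanh(r)


def critical_ratio(r: float, n: int) -> float:
    """Divide z' by the approximate standard error 1 / sqrt(N - 3)."""
    if n <= 3:
        raise ValueError("N must be greater than 3")
    return fisher_z(r) * math.sqrt(n - 3)


def classify(r: float, n: int) -> str:
    """Label a validity coefficient by direction and significance level."""
    ratio = critical_ratio(r, n)
    if abs(ratio) < CUTOFF_5_PERCENT:
        return "not significantly different from zero"
    direction = "positive" if ratio > 0 else "negative"
    level = "1%" if abs(ratio) >= CUTOFF_1_PERCENT else "5%"
    return f"{direction} and significant at the {level} level"


# Hypothetical coefficients and sample sizes, for illustration only.
for r, n in [(0.35, 50), (-0.60, 30), (0.10, 40)]:
    print(f"r = {r:+.2f}, N = {n}: {classify(r, n)}")
```

Because the standard error depends only on N, the same coefficient may be significant in a large sample and insignificant in a small one, which is why each coefficient has to be tested against its own number of cases.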


Results


In Table 1 are presented the median validity coefficients of intelligence
tests for various occupational groups together with the number of coeffi-
cients on which the median was computed. It is clear that for certain
of these occupational groups, namely, clerical workers, sales clerks, semi-
skilled workers, and unskilled workers, the data are sufficient to draw
rather definite conclusions. For clerical workers the relationship be-
tween test scores and job proficiency is only moderately high, being .39.
In most cases a validity coefficient of this order would be considered as
minimally acceptable for a single test. For sales clerks and semiskilled
and unskilled workers the median coefficients are too low to indicate any
general usefulness of intelligence tests in these occupations. With sales 
clerks there is even a strong indication of a negative relation. In the 
case of supervisors and skilled workers there is a suggestion that intelli- 
gence tests might be helpful in selecting workers in these groups. 


Table 1 


Median Validity Coefficients of Intelligence Tests for Various Occupational 
Groups in the Prediction of Job Proficiency 


                         Median        Number of
                         Validity      Validity
Occupational Group       Coefficient   Coefficients
Clerical workers            .35            85
Supervisors                 .40             9
Salesmen                    .38             4
Sales clerks               -.09            18
Protective service          .25             6
Skilled workers             .55             6
Semiskilled workers         .20            45
Unskilled workers           .08            13


However indicative these median validity coefficients might be, they 
are at best only a rough approximation to the predictive effectiveness
that intelligence tests might have in any particular employment situation. 
Furthermore, the findings from different reported studies with respect to 
the effectiveness of a particular type of test for the selection of workers 
for a given job are only infrequently in close agreement. Rather, the 
validity coefficients reported by different research workers for the same 
type of test and job are likely to vary over a considerable range of values.
Figure 1 presents a graphic picture of the distribution of validity coeffi- 
cients for intelligence tests for the different occupational groups. The 
marked variation in findings reported by different investigators is fully 
revealed by this figure. To some extent this variation can be attributed 
to differences in the types of intelligence tests used, to differences in the 
reliability of the criteria employed in validation, to differences in the 
homogeneity of occupational groups, and to sampling error. However, it 
seems likely that much of the variation in effectiveness of the intelligence
tests must be ascribed to differences in the demands and requirements
set for a job which in different organizations would appear to be similar in
name only. In other words, it would appear that the job and worker
specifications for a particular job vary to a marked extent from one estab-
lishment to another. Thus, the fact that intelligence tests were found
to be effective in selecting workers for a particular job in one or more
establishments would give no assurance that they would be equally useful
for a job with the same name in other establishments. There is posed
here the problem of getting more adequate job and worker analyses
and specifications in the framing and describing of the validating criteria.

[Figure 1: one panel for each occupational group (clerical workers,
supervisors, salesmen, sales clerks, protective service, skilled workers,
semiskilled workers, unskilled workers). Horizontal axis: validity
coefficient, -.50 to +.75. Vertical axis: number of validity coefficients.]

Fig. 1. Distributions of validity coefficients of intelligence tests for
different occupational groups.


In Figure 1 it will be seen that for clerical workers, supervisors, and un-
skilled workers the validity coefficients range from about zero to as high
as .60 to .70. In the case of semiskilled workers, while some of the
coefficients reach this top limit, a few are negative and reach as low as
-.35. With skilled workers the coefficients are generally high, and for
salesmen and protective service workers the coefficients are fairly homo-
geneous and are of the order of .30. The one striking difference appears
in the case of sales clerks. With this group thirteen of the eighteen 
validity coefficients are negative and range as low as —.60, the highest 
positive coefficient being .25. 


Table 2

Significance of Validity Coefficients of Intelligence Tests for Various
Occupational Groups

                            Number of Validity Coefficients

                      Positive and     Not Signifi-    Negative and
                      Significant at   cantly Dif-     Significant at
Occupational          1%      5%       ferent from     5%      1%
Group                 Level   Level    Zero            Level   Level   Total
Clerical workers       48      11        25                             84
Supervisors             5       2         2                              9
Salesmen                3       1                                        4
Sales clerks                    1        11             1       5       18
Protective service              2         4                              6
Skilled workers         5       1                                        6
Semiskilled workers    20       1        23             1               45
Unskilled workers       2       2         9                             13
All workers            83      21        74             2       5      185



A measure of the statistical significance of the validity coefficients of
the intelligence tests for the various occupational groups is given in
Table 2. It will be noted from this table that for clerical workers, super-
visors, salesmen, and skilled workers the majority of the coefficients are
significantly different from zero in the direction of a positive relation-
ship. For protective service workers and unskilled workers the majority
of the coefficients are not significantly different from zero. In the case
of semiskilled occupations only one of the 45 validity coefficients ap-
proaches significance in the direction of a negative relationship while the
remainder are about equally divided between a significant positive rela-
tionship and insignificance from zero. For sales clerks a third of the
18 coefficients are significant in the negative direction, five of these being 
significant at the 1% level. Only one coefficient is significant in a positive 
direction. 


Summary and Conclusions 


An overall picture is presented of the validity of intelligence tests for 
the selection of workers in eight occupational groups. Despite wide 
variations in the kinds of tests used, the excellence of the test administra- 
tions, the adequacy of the criteria, and similar factors, there still appear 
certain tendencies in the data analyzed which may prove useful in future 
utilization of intelligence tests in selection. Keeping the foregoing limita- 
tions in mind, together with the number of validity coefficients analyzed, 
the following statements seem justified: 


1. For clerical workers intelligence tests are very useful selection in- 
struments. 

2. For supervisors, salesmen and skilled workers, intelligence tests 
show high promise of being very useful instruments but further knowledge 
of their effectiveness is needed. 

3. For sales clerks and unskilled workers intelligence tests are of 
little service in selection. 

4. For semiskilled workers intelligence tests may prove of some value 
when used in combination with other tests but as single instruments they 
show little promise. 

5. For workers in the protective service intelligence tests may prove 
useful when combined with other tests but at present there is need of 
further data before a definite conclusion can be made. 

Received March 29, 1948. 


Evaluation of a Clerical Applicant Testing Program * 


William James Giese 
William James Giese, Ph.D., and Associates, Chicago 8, Illinois 


and 


Frances Weigle 
David C. Cook Publishing Company, Elgin, Illinois 


All employment testing programs should be critically and systemati- 
cally reviewed to learn if the tests are measuring the capacities, profi- 
ciencies, etc. which are significantly related to job success. Such a study 
will point out any needs for changes in the program or determine whether 
or not the program is worth continuing. 

At the David C. Cook Publishing Company the applicant load is
relatively low. Employment tests, however, have been used as general 
interviewing aids. When sufficient data become available, standards or 
“local norms” in terms of test scores can be set up if there is a high rela- 
tionship between the test scores and employee desirability. Such in- 
formation will be especially useful when the number of applicants becomes 
more plentiful. 

Employment tests have been used by Cook’s for about 8 years. The 
company retained the services of a consulting firm for the purpose of 
installing psychological tests for the selection of clerical personnel in the 
early part of 1940. The Clerical Test D was given to most clerical 
employees and the test scores related to merit ratings. However, these 
early data on present personnel are not available. 

Since 1940, the Clerical Test D and the StenoGaugE have been given 
to nearly all of the applicants for clerical or stenographic positions. Other 
tests have been given to many of the employees, but only the two tests 
mentioned had sufficient data to permit an evaluation of their usefulness
as selection and placement aids.

These tests were usually administered by the Personnel Manager from 
1940 to November 1943, and since November 1943 they have been ad- 
ministered by a personnel assistant who has an A.B. in Psychology or by a 
qualified consultant.

* The authors wish to express their appreciation to the David C. Cook Publishing 
Company for their interest and cooperation in making this article possible. 


Method 


To find out how useful these tests had been in selecting desirable 
personnel the authors investigated the relationship between scores made 
on the StenoGaugE and the Clerical Test D at the time of employment
and subsequent experience with the people as employees.

They considered as possible criteria objective records of performance
such as length of service, production, absenteeism, accidents, errors, and
similar records. They also considered as criteria, systematic but non-
objective records such as: merit ratings, willingness to rehire at time of
termination, estimates of promotability and similar systematic but non-
objective records.

In deciding which of the available material just mentioned would be
practical to use, the following standards were used: meaning in terms of
final results, consistency and probable accuracy of the records, number
of employees involved, and accessibility of the data. 


Results 
From the data which were practical to use, the authors found the 
StenoGaugE to be helpful in measuring typing and spelling proficiency 
since the test relates positively to both supervisors’ ratings and super- 
visors' willingness to rehire. 
The correlation between scores on the StenoGaugE and supervisors’
ratings is .61 ± .10. Figure 1 illustrates how well the StenoGaugE
identifies those applicants at the time of employment who will be rated
high and those who will be rated low by their supervisors after 3 months
on the job.

[Figure 1: applicants rated in the upper 56% and in the lower 44% by
supervisors, set against standing on the StenoGaugE ("In Upper 58%");
N = 43; r = .61 ± .10.]

Fig. 1. How the applicants’ scores on the StenoGaugE at the time of
employment relate to supervisors’ rating after three months of service.

Figure 2 shows that there is a positive relationship between the super-
visors’ willingness to rehire employees who have terminated and the
employee's score on the StenoGaugE at the time of employment. Figure
2 also shows a positive relationship between remaining with the company
and the score made on the StenoGaugE at the time of employment. Of
the people who have left the company, those with higher scores on the
StenoGaugE tend to work longer before terminating although r between
length of service before termination and scores on the StenoGaugE is
only .18 ± .11 (SE for an r of .0).

[Figure 2: raw-score distributions in three panels: employees who have
terminated and are eligible for rehire (N = 62); employees who have
terminated and are not eligible for rehire (N = 18); employees who are
with the company as of August 1947.]

Fig. 2. Relationship between scores on the StenoGaugE and
turnover (January 1940 to August 1947).
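
The standard errors attached to these correlations can be checked with a short Python sketch. It rests on assumptions the article does not confirm: that the familiar large-sample formula (1 - r^2) / sqrt(N - 1) was the one used, and that the terminated group numbered 80 (the 62 and 18 cases shown in Figure 2). The function names are hypothetical.

```python
import math


def se_of_r(r: float, n: int) -> float:
    """Large-sample standard error of a Pearson r: (1 - r^2) / sqrt(N - 1)."""
    if n < 2:
        raise ValueError("N must be at least 2")
    return (1.0 - r * r) / math.sqrt(n - 1)


def se_of_zero_r(n: int) -> float:
    """Standard error for a true correlation of zero: 1 / sqrt(N - 1)."""
    return se_of_r(0.0, n)


# StenoGaugE against supervisors' ratings: r = .61 with N = 43.
se_ratings = se_of_r(0.61, 43)

# StenoGaugE against length of service before termination: r = .18.
# N = 80 is an assumption (62 + 18 terminated employees in Figure 2).
se_service = se_of_zero_r(80)
ratio = 0.18 / se_service

print(f"SE for r = .61, N = 43: {se_ratings:.2f}")
print(f"SE for a zero r, N = 80: {se_service:.2f}")
print(f"r / SE for length of service: {ratio:.1f}")
```

Under these assumptions both standard errors agree with the reported values to two decimals, and the length-of-service correlation is only about 1.6 standard errors from zero, which fits the authors' cautious wording about it.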




From these relationships it was concluded that the StenoGaugE is 
doing a reasonably good job of measuring proficiencies which are crucial 
to job success in the typing positions. 

From the data which were practical to use, the authors found the re- 
lationship between Clerical Test D and supervisors' ratings of clerical 
employees to be not nearly as high as was the relationship between the 
StenoGaugE and the typists' ratings by their supervisors. The Pearson 
Product-Moment Correlation is .39 ± .10 between supervisors’ ratings
and test scores at the time of employment. Figure 3 illustrates, in 
graphic form, this relationship. 


Fig. 3. How the applicants’ scores on the Clerical Test D at the time of
employment relate to supervisors’ rating after three months of service. 


No relationship was found between length of service of office workers
and score on Clerical Test D. The correlation is .0.

With regard to turnover there is a low relationship between the
supervisors' willingness to rehire a terminated employee and the em-
ployee's Clerical Test D score at the time of employment. Figure 4
illustrates this relationship.

From these relationships it was concluded that Clerical Test D is doing
a poor job of measuring those capacities which are significant to job success
in the general office.


Summary and Recommendations 


On the basis of these findings the following recommendations were
made to the David C. Cook Publishing Company:

1. The StenoGaugE should continue to be used as an employment
test for the typing jobs, requiring as high a test score as the selection ratio
will permit. A revision of the scoring system should be considered which
will give the test more differentiation at the higher levels.


[Figure 4: raw-score distributions in three panels: employees who have
terminated and are eligible for rehire; employees who have terminated
and are not eligible for rehire; employees who are with the company as
of August 1947. One panel is marked N = 34.]

Fig. 4. Relationship between scores on the Clerical Test D and
turnover (January 1940 to August 1947).


2. The Clerical Test D should be dropped. An analysis of the
various jobs for which it was being used as a predictor revealed they were
not general clerical jobs, but were jobs which were more likely to be
primarily filing, comparing numbers, etc., or jobs which were primarily
computational. Apparently, the test does not measure reliably or validly
[...] type of clerical ability. Furthermore, the scoring is rather involved
and subject to error.

3. For the more strictly clerical jobs, an intelligence test and a clerical 




4. For those office jobs which are primarily computational in nature, 
an intelligence test and an arithmetical proficiency test are recommended.

5. Since employment testing is established and accepted it is recom-
mended that it be expanded to all applicants at the hourly rated and
nonexempt salary levels for those jobs which demand capacities that are
practical to measure in the employment office. The increase in cost
would be negligible and pertinent test information might direct and 
shorten interviewing time. 
Received May 15, 1948. 


Use of the “Group Situation Observation" Method in the 
Selection of Trainee Executives 


Ronald Taft 
The Institute of Industrial Management, Melbourne, Australia 


A recurrent problem in the planned programs for the selection and 
training of young executives is that of predicting the likely future de- 
velopment of the potential trainee while he is still a youth. This article 
describes the application of the group situation observation technique to 
this problem of selection. This technique was originally used in the 
German Army Selection procedures, and adopted (and adapted) by the 
British (3) and Australian Armies (4). The U. S. Army O. S. S. also
utilized the basic principles in connection with the selection of personnel 
for operations behind enemy lines (6). Since the conclusion of the War, 
it has been applied to the selection of trainee industrial foremen, managers 
and civil service administrators, mainly in Britain (1,2). The present 
report deals with the application of the technique to a group whose age is 
well below that of other reported uses (17 to 19 years). 

The position for which the candidates were being considered was that
of trainee production executive, in a shoe factory with 200 employees. 
Two trainees were required. Because of the long-range nature of the 
training program, no exact definition of the traits required by these 
trainees was attempted, but the selectors were familiar with the factory 
and the approximate duties which would be required of the future 
executives. 

Procedure 

The screening procedure prior to the group observation sessions is 
given briefly to provide a background to the data available to the selectors. 

Written applications from 63 persons were received as a result of 
newspaper advertisements, and 13 of these were rejected without inter- 
view on educational grounds. The Managing Director of the Company 
then gave an orientation and screening interview to the remaining 
candidates, as a result of which 11 were rejected as “unsuitable types" 
and 5 withdrew their applications. Eleven failed to report for this 
Interview. 

The remaining 23 applicants were then given a vocational guidance
interview by the writer, at which time they were given the following tests:
Vocational Interest Questionnaire; Personal Questionnaire “T” (Hanawalt


588 Ronald Taft


and Richardson); Oral and Written Directions (Adaptation of
Army Alpha); H Test (short form) (Adaptation of Army Alpha); Speed
and Accuracy (Minnesota Vocational Test for Clerical Workers); Space
Form Perception (Australian Institute of Industrial Psychology); and
Mechanical Comprehension (Bennett AA). This was followed by a
half-hour interview. Two failed, however, to report for this interview.

Seven more applicants were rejected at this stage on grounds of 
interest, temperament or ability, including all those with a score of less 
than the 60th centile on general population norms for the H test. 
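
The screening arithmetic reported above can be checked with a short tally. The sketch below is only a bookkeeping aid and forms no part of the original procedure; the stage labels are informal names chosen here, and the counts are those given in the text.

```python
# Tally of applicants remaining after each screening stage.
# Counts are taken from the text; the stage labels are informal.
stages = [
    ("rejected on educational grounds", 13),
    ("rejected at orientation interview", 11),
    ("withdrew applications", 5),
    ("failed to report for orientation interview", 11),
    ("failed to report for guidance interview", 2),
    ("rejected on interest, temperament or ability", 7),
]

remaining = 63  # written applications received
history = [remaining]
for label, lost in stages:
    remaining -= lost
    history.append(remaining)
    print(f"{label:<46} -{lost:>2}  remaining: {remaining}")
```

The tally leaves 23 applicants before the vocational guidance interview and 14 at the end of screening, in agreement with the figures in the text.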


Group Situation Examination 


The remaining 14 candidates were invited by mail to be present at the 
home of the Managing Director “to spend the day with him in connection 
with your application for employment.” One failed to attend. The
others were divided into three separate groups, of four or five, each group
being arranged for either a Saturday or a Sunday. During the group
situation they were under the observation of the Managing Director and 
the writer (henceforth referred to as the Psychologist), the latter control- 
ling the day's proceedings. 

The following program was observed. 


Step   Time Period       Activity

 1.    11.45 to 12       Introduction.
 2.    12 to 12.15       Personal History.
 3.    12.15 to 12.45    Game—“Who am I?”
 4.    12.45 to 1.45     Lunch.
 5.    1.45 to 3.30      Group Rorschach Test.
 6.    3.30 to 4.15      Leaderless Discussion.
 7.    4.15 to 4.30      Afternoon Tea.
 8.    4.30 to 5         Problem Situation Discussion.
 9.    5 to 5.30         Personality Judgments of Self and other Candidates.
10.    5.30              Closing Address by the Managing Director.


1. Introduction. Candidates were welcomed and introduced to each 
other by the Managing Director, and a brief word on the procedure was 
given by the Psychologist. They were asked to try to adopt an informal
attitude and to refer to each other by their first names. They were 
warned that it is “impossible to beat the system,” so that it would be in 
their best interests to try to be natural right from the beginning rather 
than to bluff their way through. 

2. Personal History. Each candidate was asked in turn to “introduce 
yourself to the others by stating briefly your personal history.” No 
further instruction was given, and they were called on in order of age,
starting from the oldest. At the conclusion of these short outlines, the


The “Group Situation Observation" Method 589 


candidates were given an opportunity of asking questions about the
others, but a total of only four questions was asked. 

This procedure was of some value in giving the candidates a brief 
outline of their colleagues’ background; and also in indicating which 
factors the candidates considered significant in their lives. However, 
there was a tendency to adopt the pattern followed by the first speaker, 
and it was necessary in evaluating the contributions to consider this 
factor. Thus credit was given to a fourth speaker who broke away from 
an unsatisfactory habit adopted by the prior speakers of speaking about 
their schools rather than themselves.¹

Indications about the candidates obtained from this procedure mainly 
related to self-confidence, particularly while in a situation calculated to 
unsettle them; also their ability to select salient factors, and to follow 
an independent line. 

3. Game—“Who am I?” Candidates were then informed that they
were to play a game commonly known as “Who am I?” or “Personalities.”
They were instructed as follows: “One person is to leave the room, and
the others are to imagine that they represent a well-known personality,
either living or dead. The person leaving the room should be brought
back and should endeavour to find out who the personality is, by asking
each one of the others in turn a question the answer to which is either
‘Yes’ or ‘No.’ You should keep on asking questions until you have
narrowed down the field, and you are allowed only one guess. I will not
give you any further instructions, and you should work out any other
details yourselves. Continue with this game until each one of you has
had a turn.” Whenever any questions were asked they were reminded
that they were “on their own.”

This test appeared to be particularly useful as a means of introducing 
the group to the leaderless group situation, as on each occasion problems 
regarding the observance of the rules were raised. Information obtained 
from this session related to the ability of the candidates to get their 
opinions accepted, attitudes towards the observance of rules, flexibility, 
intelligence, concentration, reaction to frustration, impulsiveness (tend- 
ency to guess rather than analyse), persistence, sympathy with the 
difficulties met by others, extent of general knowledge, identification with 
famous people, and so on. For example, when the questioner made an
incorrect guess, it was useful to observe how the others responded to the
rule that only one guess should be permitted.

4. Lunch. During lunch the Managing Director and the Psychologist
endeavoured to take part in the conversation and to make the atmosphere
informal. Lunch commenced as a standing buffet to permit candidates
who were attracted to each other to come together, and a “mental” note
was made of their social and individual behaviour.

¹ In referring to information obtained as a result of the various tests, the writer has in
mind the notes made by the observers at the time, but no attempt was made to infer
particular characteristics from the one test only.

5. Group Rorschach. This was not properly a group situation test, 
but was introduced at this stage of the selection procedure for convenience 
only. Also it was felt that doing this test would help to break down 
tension, by strengthening the feeling on the part of all the candidates that 
they were going through the same trial together. 

The use made of the Rorschach interpretations was similar to the use
made of the aptitude tests; that is, it was primarily a screening device, 
intended to cull out those with definite neurotic symptoms. In this 
respect one was rejected as too uncontrolled and one as too inhibited, the 
latter giving only eight responses. 

6. Leaderless Discussion. The candidates were seated in a circle, with 
the Managing Director and Psychologist at the side. They were told to 
regard the latter as “merely pieces of furniture,” and that they were now
to discuss any topic on which they might decide. No further instructions
were given. 

The group dynamics involved in the selection of the topic itself provided 
valuable material. This test was also useful for observing how the 
subject stands up to argument, whether he perseveres or shows resistance 
to persuasion and whether he becomes emotional. In two of the three 
groups a dominant person seemed to arise at this juncture, and an op- 
portunity was afforded for observing whether the form of domination was 
“autocratic” or “integrative” (in the sense used by Lewin). 

7. Problem Situation Discussion. This discussion differed from the 
previous one only in so far as it was more structured, that is, the group 
was given an actual assignment. The candidates were given the facts 
about the hours of work at the factory at which they had applied for the 
position, and were asked to report their recommendations back to the 
Managing Director on how they considered these hours should be altered 
to arrange a 40-hour week. (The factory was previously working a
44-hour week.) 

This discussion again gave scope for observing tendencies in certain 
of the candidates to dominate their group. It also was revealing about 
the knowledge of the candidates as to the general situation in industry, 
and their attitude towards management and employees (this was con- 
sidered in conjunction with their previous experience and home back- 
ground). 

The main difference between the leaderless discussion and the problem
situation discussion is that the former gives more scope for the individual
to show his personality and ability qua individual, while the latter stresses
rather the individual as a member of a group, the members of which are
all motivated towards the same end, that is, finding the solution to the
problem.

8. Personality Judgments. The candidates were then instructed as 
follows: “It is an important part of the duties of a factory manager to be 
able to sum up other people and himself objectively, and if necessary, 
ruthlessly. You should now write a thumb-nail sketch of the other
candidates and yourself, with particular regard to their personalities with
respect to the position of trainee factory executive. All of your reports
will be anonymous and will not sway our judgment either against or in 
favour of any particular candidate.” They were seated at separate 
tables for this task in order to reduce any inhibitions that may have arisen 
from the close proximity of the persons being rated. 

The judgments made varied considerably in quality, and revealed
varying willingness to unmask personalities. The insight possessed by 
the candidates also appeared to vary considerably. 

9. Report on the Proceedings. In his closing address the Managing
Director requested the candidates to forward to him by mail a report recounting
the proceedings of the day, and giving their impressions of what
had occurred. These reports provided an indication of each candidate's
judgment, ability to write a report on factual occurrences, powers of 
observation, memory for details, and maturity in evaluating a situation. 


Evaluation of the Candidates 


At the conclusion of each day’s observations, the Managing Director 
and the Psychologist discussed and tentatively evaluated the candidates 
in terms of their suitability for the position in question. Following the 
practice used in the evaluation of O. S. S. candidates (5), they were not
judged on their comparative levels on a number of traits, but they were
discussed rather in terms of their weak and strong points as shown in the
various situational tests conducted during the day.

When the Rorschach tests had been scored and the reports received
from the candidates, a final selection conference was held. All the available
information and reports on the candidates were considered, with
particular weight given to the group observation data, since the other
data had already been used for screening purposes.


Evaluation of the Procedure 

A consideration of the validity of the group observation procedure
involves two major questions: (a) How well does it predict the ultimate
success of the candidates? and (b) Does it add anything to the predictive
power of the usual test battery plus interview?


It would be difficult to answer either of these questions in the absence
of criteria provided by long-range longitudinal studies. However, in 
respect to question (b) it may be of value to compare the rankings made 
by the Psychologist after the vocational guidance interview with the 
overall rankings made at the completion of the selection procedure. 
These are set out in Table 1. 


Table 1
Showing Comparative Rankings of Candidates by the Psychologist
After the Vocational Guidance Interview and the Overall Ranking
After the Completion of the Selection Procedure


             Rank Order
             after Voc.       Overall        Change after
Candidate    Guid. Interv.    Rank Order     Group Observ.

A             2                1              +1
B             1                2              -1
C             5                3              +2
D             3                4              -1
E            10                5              +5
F             7.5              6              +1.5
G             9                7              +2
H             4                8              -4
I            11                9              +2
J             7.5             10              -2.5
K            12               11              +1
L            13               12              +1
M             6               13              -7


Candidates A and B—the selected candidates—would have been 
chosen as the first two choices without the group observation interview. 
However, there are significant changes in the position of the other 
candidates, and it is possible that such changes could have occurred in 
the case of candidates A and B. 
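
One way to summarize how far the two rank orders in Table 1 agree is a rank correlation, which the original report does not compute. The sketch below is offered only as an illustration. It applies the product-moment formula to the two columns of ranks (which gives Spearman's coefficient and allows for the tied ranks of 7.5), and it assumes that the poorly legible first rank of candidate I is 11, the only rank otherwise missing from that column.

```python
from math import sqrt

# Ranks from Table 1, candidates A to M.
# Assumption: candidate I's first rank is read as 11.
voc_rank = [2, 1, 5, 3, 10, 7.5, 9, 4, 11, 7.5, 12, 13, 6]
overall_rank = list(range(1, 14))

def pearson(x, y):
    """Product-moment correlation; applied to ranks it gives Spearman's rho."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

rho = pearson(voc_rank, overall_rank)
# Positive change = candidate moved up after the group observation.
changes = [v - o for v, o in zip(voc_rank, overall_rank)]
print(f"rho = {rho:.2f}")
print("changes:", changes)
```

With these readings the coefficient comes to about .68, and the changes sum to zero as they must for two complete rankings of the same thirteen candidates. The agreement is substantial, yet it leaves room for the sizeable shifts shown by several candidates other than A and B.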

As far as the individual items of the group observation sessions are
concerned it is difficult to evaluate their separate contributions to the 
final result, as the day’s proceedings have been viewed as a unit which 
develops progressively. 


1. The group situation used in selection is so variant from the actual
situation as to be worthless as a basis for drawing inferences, if not
actually misleading.


It is pointed out, however, that there is sufficient correspondence
between the “artificial” and the “actual” situations to expect similar
samples of behaviour. For example, it would be expected that a candidate
whose logic deteriorated as a result of emotional involvement in the
leaderless discussion would show similar reactions in the everyday relationships
with other factory executives.

2. Inferences are not permissible from the ability of a candidate to 
lead the other candidates to his ability to lead a group of factory workers. 

This criticism is unavoidable in any form of selection excepting that of 
trial and error, and it is believed that the differences between the two 
social groups were constantly borne in mind by the observers. 

3. The group observation technique assumes consistency of behaviour 
from one situation to another (i.e. test-retest reliability) without regard to 
temporary moods or reactions to unusual circumstances. 

However, if the candidate shows up badly during the observation, it 
seems a reasonable assumption that there will be occasions on the job 
when he will do likewise. 


Viewpoints on the Procedure 


The Managing Director felt that the group observation procedure had 
given him an opportunity to participate fully in the selection procedure, 
and to obtain a preview of his potential employees’ behaviour. It had 
also eliminated much of the esoteric aura that has surrounded the work of 
the psychologist as seen by the layman. 

The reports submitted by the candidates showed that they too con- 
sidered the procedure a particularly just one, eight of the fourteen stating 
this explicitly. Several of them also revealed in their remarks signs of 
the self-clarification which has been noted by other writers on this subject. 
(This “self-clarification” can be compared to the insight which develops
as a result of participation in role-playing.) One typical remark was “I
[will] always remember today as a day of enlightenment and experience
[in] my life.”


Summary 

The problem of developing future executives for industry frequently
requires a planned program involving the selection of potential executives
from amongst comparatively young and untried persons. The usual
methods of psychologically testing and interviewing candidates are limited
by the difficulty of inferring social behaviour traits (such as dominance,
cooperativeness, ability to persuade, stability in the face of emotional
stress, sound judgment, etc.).


During World War II the Group Situation Observation Method was 
devised, mainly by the British Army psychologists, to meet this difficulty 
in the selection of officers, and the method is now being applied to the 
selection of industrial and administrative executives. An application of 
this technique has been described where the problem was to select two 
trainee factory executives for a small shoe factory. The candidates were
first screened by means of aptitude tests and an interview and were then
divided into groups of four or five for observation. The full day's
procedure included a personal introduction by each candidate, a group
Rorschach Test, an unstructured and a structured discussion period, and 
personality ratings by each candidate of the others. 

Received June 4, 1948. 
References 
1. Bridges, H., and Isdell-Carpenter, R. Selection of management trainees. Industr.
Welf. Person. Mgmt., 1947, 19, 177-180, 315.
2. Frazer, J. M. New-type selection boards in industry. Occup. Psychol., 1947, 11,
170-178.
3. Garforth, F. I. De La P. War officer selection boards. Occup. Psychol., 1945, 14,
97-108.
4. Gibb, C. A. The principles and traits of leadership. J. abn. soc. Psychol., 1947, 42,
267-284.
5. MacKinnon, Donald W. Some problems of assessment. Trans. N. Y. Acad. Sci.,
1947, 9, 171-185 (original not seen, quoted in Psychol. Abstr., 1947, 21, 496).
6. Murray, H. A. Assessment of the whole person. In Kelly, G. A., New methods in
applied psychology. Univ. Md., College Park, 1947 (original not seen, quoted in Psychol.
Abstr., 1947, 21, 496). See also, OSS Assessment Staff, Assessment of men.
New York: Rinehart & Co., Inc. 1948. Pp. 541.


Cautions Concerning the Use of the Taylor-Russell
Tables in Employee Selection 


Max Smith 
The City College of New York 


The importance of the selection ratio in determining the practical
effectiveness of tests in selection is pointed out in an article by Taylor
and Russell (2). In the article the authors also present tables for estimating
the degree of such effectiveness when the validity scattergram represents
a normal bivariate surface. With this type of correlation distribution,
when the use of a small selection ratio is feasible, even tests of low
validity may be highly effective in selecting better employees. The conclusion,
however, does not apply to every type of validity scattergram.
Overlooking this fact may lead to unjustified reliance upon a small selection
ratio as a possible substitute for high validity in obtaining better
[employees].

It is the purpose of this article to present three considerations which
[may] help to avoid unwarranted reliance upon low validity coefficients or
[misuse] of the Taylor-Russell tables: First, where a triangular scattergram
is found between test scores and criterion scores, there may frequently
[. . .] likelihood, the specific probabilities indicated in the Taylor-Russell
tables are quite inapplicable to triangular scattergrams. Third,
even in the case where an approximately normal, elliptical scattergram
[does] exist, the Taylor-Russell tables as ordinarily used will often yield
[. . .]

Limited Value of the Selection Ratio
with Triangular Scattergrams

Concerning the value of a low selection ratio, one widely used industrial
psychology text says:


“. . . in group testing, a reduction of the selection ratio is a substitute for
validity. This statement . . . mean[s] that if the test has any significant
validity, however small, it is possible for the employer to get the same functional
value from it that he could get from a test of any validity, however
[high], if he is able sufficiently to reduce the selection ratio” (3, p. 69).


The foregoing statement is theoretically true when the test scores
and the criterion scores yield a normal bivariate distribution. But often
in vocational prediction the correlation surface is definitely not normal, 
The scattergrams found are frequently triangular in shape, not elliptical 
(3, p. 80). Nevertheless, the author goes on to say: 


“A question may be raised as to whether the reasoning on pages [. . .],
in which the significance of the selection ratio concept was discussed, is valid
when the scattergram is triangular rather than oval in shape. This reasoning
applies equally well regardless of the shape of the scattergram as long as a
vertical line drawn through the distribution at any point will divide the individuals
plotted so as to give a higher average criterion score to those on the
right of the line than to those on the left. This situation is certain to occur
[whenever] a positive correlation between test scores and criterion exists. Thus, the
rather common existence of a triangular scattergram does not invalidate the importance
of the selection ratio” (Italics added) (3, p. 80).


On the contrary, this article will attempt to show that “the rather 
common existence of a triangular scattergram” does in certain circumstances
“invalidate the importance of the selection ratio,” and that in
general the Taylor-Russell tables are inapplicable to such distributions. 

The mere fact that a positive correlation between test scores and 
criterion exists is no guarantee whatsoever that a vertical line drawn 
through the distribution at any given point will divide the individuals 
plotted so as to give a higher average criterion score to those at the right 
of the line. Often the triangularity of a scattergram indicates a fairly
high predictive value for low test scores, accompanied by negligible pre- 
dictive value for higher test scores. In such situations, reducing the 
selection ratio below 100 per cent will at first be more effective than the 
Taylor-Russell tables would lead us to expect. But after a certain selection
ratio has been reached, testing more individuals in order to reject a
larger proportion of them will have very little practical effect. Further 
practical effectiveness in reducing the proportion of unsatisfactory 
workers will now require predictive measures of increased validity. 

The relation between visual acuity and production among gaugers 
furnishes an excellent example of such a situation:


“Figure 59 shows . . . the type of relation between near acuity and production
rating for 177 gaugers. The average rating increases with higher
acuity scores, up to 7. Above that point the average rating does not vary
systematically. The production averages of all groups scoring 7 or better in
near acuity are higher than are those of any groups scoring below 7. On this
[. . .] higher than 7 [. . .] the
test. It is as though acuity [. . .] this job, [and]
better acuity is not needed” (3, p. 204).


The application of the foregoing facts to the question of the selection
ratio seems quite clear. Testing the near acuity of enough applicants
so that none with a score below 7 need be hired as a gauger would increase
the effectiveness of selection. (Alternatively, perhaps, corrective lenses
might be prescribed for those with lower scores.) Reducing the selection
ratio beyond that point would have no additional value. Further refinement
of selection would depend upon finding measures which would show
some correlation with rated production among applicants whose visual
acuity score was 7 or higher.
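
The plateau just described can be made concrete with a small deterministic sketch. The figures are hypothetical and are not the gauger data: applicants are assumed to be spread evenly over test scores 1 to 12, and the chance of proving satisfactory is assumed to rise with score up to 7 and to remain constant thereafter. The function names and probabilities are illustrative only.

```python
# Hypothetical "triangular" relation: the test discriminates only below score 7.
SCORES = range(1, 13)  # assumed test scores, equally many applicants at each

def p_satisfactory(score):
    """Assumed probability that an applicant with this score proves satisfactory."""
    if score >= 7:
        return 0.85                     # plateau: higher scores add nothing
    return 0.40 + 0.075 * (score - 1)   # rises from .40 at score 1

def outcome(cutoff):
    """Selection ratio and proportion satisfactory when hiring scores >= cutoff."""
    chosen = [s for s in SCORES if s >= cutoff]
    ratio = len(chosen) / len(SCORES)
    prop_ok = sum(p_satisfactory(s) for s in chosen) / len(chosen)
    return ratio, prop_ok

results = {c: outcome(c) for c in SCORES}
for c, (ratio, prop_ok) in results.items():
    print(f"cutoff {c:>2}  selection ratio {ratio:.2f}  satisfactory {prop_ok:.3f}")
```

Under these assumptions the proportion satisfactory climbs as the cutoff is raised to 7 (a selection ratio of .50) and does not change for any stricter cutoff, which is the situation in which a further reduction of the selection ratio has no additional value.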

In the above example the application of the Taylor-Russell tables
would obviously be unwarranted. If, however, a personnel executive
had been misled into believing that a triangular scattergram “does not
invalidate the importance of the selection ratio,” he might wrongly place
reliance on the tabled expectancies instead of searching for more valid
measures. To the extent that such reliance led to complacent satisfaction
with inadequate validity coefficients, it might perhaps be the very factor
that prevented the possibility of real improvement in selection.


Inapplicability of the Probability Estimates 
to Triangular Scattergrams 


Even when a triangular scattergram does represent a definitely increasing
average criterion score accompanying increasing average test
scores, the specific figures in the Taylor-Russell tables are not applicable.
As Taylor and Russell (2, p. 571) explain, their tables are based on
Pearson’s “Tables for Finding the Volumes of the Normal Bivariate Surface”
and consequently assume a normal bivariate distribution. To the
extent that a given distribution departs from this assumption the specific 
probabilities in the Taylor-Russell tables do not hold. This point is 
obvious and need only be affirmed to be recognized. For students, 
though, it should be made quite explicit in order to insure its not being 
overlooked. 

Statements such as “This reasoning applies equally well regardless of 
the shape of the scattergram . . .” and “. . . the rather common ex- 
istence of a triangular scattergram does not invalidate the importance 
of the selection ratio,” however, not only fail to make the point; they
might easily be interpreted by the average student as implying that the 
Taylor-Russell tables are applicable to triangular scattergrams. 

The use of the product-moment coefficient of correlation with tri- 
angular scattergrams is ordinarily inadvisable in the first place, since the 
assumption of rectilinearity is generally not fulfilled. To enter the 
Taylor-Russell tables with such a coefficient and to make practical decisions
concerning selection ratios on the basis of the figures one finds is
likely to lead only to misdirected effort and to ultimate disappointment. 

In many practical examples throughout his Industrial Psychology, 
Tiffin does illustrate sensible, realistic procedures, which students can 
profitably imitate. Despite his statement that “the rather common
existence of a triangular scattergram does not invalidate the importance
of the selection ratio,” he does not in practice apply the Taylor-Russell
tables to such distributions. Nevertheless, in view of his explanation 
of the use of these tables without cautioning concerning their possible 
misuse, it is not unlikely that some of his student readers may be misled. 


Use of the Tables with Elliptical Scattergrams 


When the test scores and the criterion scores yield an approximately
normal bivariate distribution (which will lead to an elliptical scattergram),
the use of the Taylor-Russell tables is justified. But in using
the tables, we must remember that they assume the applicant group and
the present employee group to be similarly constituted (2, p. 576). This 
is equivalent to assuming that at present one hundred per cent of appli- 
cants are being hired and retained or that our current selection procedures 
have zero validity or that both conditions prevail. 

The first assumption is almost certainly unwarranted as a rule. What
firm hires and keeps on the job every individual who applies? Nor is it
always to be expected that the current selection procedures are completely 
worthless, though sometimes they may be. 

A more likely situation is that at present some applicants are being 
rejected for one reason or another and that the current selection procedure 
has some degree of validity in reducing the number of unsatisfactory 
employees. The number of applicants accepted and the number re- 
jected will probably be on record; so the present selection ratio is known. 

The validity coefficient of the current selection procedures may not be 
known, but it is nevertheless operant in reducing the proportion of un- 
satisfactory employees as compared with the proportion that would exist 
if no selection whatsoever were made among applicants. 

Let us use a hypothetical example to see by how much the prospective 
effectiveness of selection might be overestimated if the Taylor-Russell 
tables were used without correcting for the assumptions just indicated. 

We shall assume that at present 73 per cent of our employees are 
satisfactory according to our criterion. What proportion of new em- 
ployees would be satisfactory if we were to use selection procedures that 
yield a validity coefficient of .30 with our criterion and if we were able to 
limit our selection to the best 40 per cent of applicants according to our 
test standards? 

Using the tables without concerning ourselves about current validities 
or selection ratios, we conclude that we may expect a decrease in the pro- 
portion of unsatisfactory workers from .27 to .18, which means a reduc- 
tion of one-third in the number of unsatisfactory employees. 


If, however, we hypothesize also that in the total applicant group the 
actual validity coefficient of current selection procedures with our criterion 
is .20 and that our present selection ratio is .60, we find that we may 
expect a decrease in the proportion of unsatisfactory workers from .27 to 
.22, not to .18 as we had inferred from an uncritical use of the tables. In 
other words, we may expect a reduction of 19 per cent (5/27) in the 
number of unsatisfactory employees, not a reduction of one-third. 

The procedure for using the Taylor-Russell tables with due allowance 
for a known pre-existing validity coefficient and selection ratio is ex- 
plained in the following paragraphs set in smaller type. When, as is 
often the case, we don’t know the validity coefficient of our current pro- 
cedures, about all we can conclude is that the Taylor-Russell tables will 
overestimate the amount of gain—but we cannot say by how much. 


Using the Taylor-Russell Tables (2, 3 Appendix B) with Allowance for 
Pre-existing r and Selection Ratio 


For the sake of brevity we shall use the following terms and symbols:
The proportion of present employees considered satisfactory, we shall call
“OK.” The proportion who will be satisfactory among those selected, we
shall term “new OK.” The validity coefficient is r; and the selection ratio, SR.

We hypothesize OK = .73, r = .30, SR = .40. First, let us use the tables
without concerning ourselves about pre-existing validities or selection ratios.

Since there is no table for OK = .73, we shall interpolate¹ between the
table for OK = .70 and the table for OK = .80. In table OK = .70, for
r = .30 and SR = .40, the new OK = .80; in table OK = .80, the corresponding
new OK = .88. So by interpolating, we conclude that we may expect
about 82 per cent of the new employees to be satisfactory instead of 73 per cent
as at present. In other words, we may expect a decrease in the proportion of
unsatisfactory workers from .27 to .18, which means a reduction of one-third
in the number of unsatisfactory employees.

Now, in addition to hypothesizing OK = .73, r = .30, SR = .40, we assume
also a pre-existing r of .20 and SR of .60. What new OK may we now expect?

In this case the table to use would not be three-tenths of the way from
OK = .70 to OK = .80, as it was when we assumed zero validity and/or a
hundred per cent selection ratio. The correct table to use would be one in
which a change from r = 0 or SR = 1.00 to r = .20, SR = .60 would have
brought about a new OK of .73 (which is our hypothetical present OK).
That is, the correct table to use would be one in which the entry is .73 at the
point where the row r = .20 intersects the column SR = .60. However, there
is no table in which this occurs; so we shall have to interpolate between two
tables.

In table OK = .60, at the point where row r = .20 intersects column
SR = .60 the entry is .65; in table OK = .70, the corresponding entry is .75.
Since .73 is eight-tenths of the way from .65 to .75, to find the new OK for
other validities and selection ratios we shall interpolate at a point eight-tenths
of the distance between the value found in table OK = .60 and that found in
table OK = .70. For our hypothesized r = .30 and SR = .40, the entry in
table OK = .60 is .71; in table OK = .70, the corresponding entry is .80.
The new OK we may expect then is about .78, not .82 as we had inferred from
an uncritical use of the tables. In other words we may expect a reduction of
19 per cent (5/27) in the number of unsatisfactory employees, not a reduction
of one-third.


¹ Linear interpolation gives a sufficiently close approximation.
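
The two interpolations worked through in this section can be retraced in a few lines. The sketch below is only an arithmetic check: the table entries are the five values quoted in the text, while the dictionary layout and the function names are conveniences introduced here and are no part of the Taylor-Russell tables themselves.

```python
def interpolate(low, high, fraction):
    """Linear interpolation between two table entries."""
    return low + fraction * (high - low)

# Table entries quoted in the text, keyed by (table OK, r, SR).
entry = {
    (0.60, 0.20, 0.60): 0.65,
    (0.70, 0.20, 0.60): 0.75,
    (0.60, 0.30, 0.40): 0.71,
    (0.70, 0.30, 0.40): 0.80,
    (0.80, 0.30, 0.40): 0.88,
}

present_ok = 0.73

# Uncritical use: treat .73 as the unselected base rate,
# three-tenths of the way from table .70 to table .80.
frac_naive = (present_ok - 0.70) / (0.80 - 0.70)
naive_ok = interpolate(entry[(0.70, 0.30, 0.40)], entry[(0.80, 0.30, 0.40)], frac_naive)

# With allowance: find where .73 falls between the entries for the
# pre-existing r = .20, SR = .60, then interpolate at that same point.
lo, hi = entry[(0.60, 0.20, 0.60)], entry[(0.70, 0.20, 0.60)]
frac_adj = (present_ok - lo) / (hi - lo)
adjusted_ok = interpolate(entry[(0.60, 0.30, 0.40)], entry[(0.70, 0.30, 0.40)], frac_adj)

def reduction(new_ok):
    """Relative reduction in the proportion of unsatisfactory employees."""
    return ((1 - present_ok) - (1 - new_ok)) / (1 - present_ok)

print(f"naive new OK    = {naive_ok:.3f}, reduction = {reduction(naive_ok):.0%}")
print(f"adjusted new OK = {adjusted_ok:.3f}, reduction = {reduction(adjusted_ok):.0%}")
```

Carried to three places the two estimates are .824 and .782, which round to the .82 and .78 given above; the reductions of one-third and 19 per cent follow from the rounded values.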


There is an additional consideration which frequently interferes with
accurate estimation from the Taylor-Russell tables, though this time in 
the direction of underestimation. The validity coefficient with which one 
is supposed to enter the tables is that prevailing among the entire range 
of applicants. But very seldom will criterion measures be available for 
all applicants. If (as the tables assume) the applicant group and the 
present employee group are similarly constituted, the validity coefficient 
among the present employee group may be substituted. But if (as is 
probably more often the case) the present employee group is more homo- 
geneous than the applicant group, the substituted r will ordinarily be 
smaller than the r among all applicants. And so we have an additional 
element of inaccuracy in using the Taylor-Russell tables. 
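
The size of this underestimation can be illustrated with the standard textbook correction for restriction of range on the predictor. This correction is not part of the present paper; it is offered only as a hedged sketch, and both the function name and the example figures (an r of .30 among employees, an applicant standard deviation one and a half times that of employees) are hypothetical.

```python
import math


def correct_for_range_restriction(r_restricted, sd_ratio):
    """Estimate r in the unrestricted (applicant) group.

    r_restricted: validity observed in the restricted (employee) group.
    sd_ratio: applicant S.D. divided by employee S.D. on the test
              (assumed known; hypothetical in the example below).
    """
    r, k = r_restricted, sd_ratio
    return r * k / math.sqrt(1 - r ** 2 + (r * k) ** 2)


# Hypothetical illustration only: not data from the article.
observed_r = 0.30
estimated_r = correct_for_range_restriction(observed_r, sd_ratio=1.5)
print(f"observed r = {observed_r:.2f}, estimated applicant r = {estimated_r:.2f}")
```

When the two groups are equally variable (a ratio of 1.0) the function returns the observed r unchanged, which corresponds to the case the tables assume.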


Comment 

In the present state of development of vocational selection the Taylor- 
Russell tables are of value primarily in drawing attention to the impor- 
tance of utilizing the selection-ratio concept rather than in furnishing 
dependable specific probabilities of expected improvement in effectiveness 
of selection. If comprehensive factor studies like those conducted by 
the Army Air Forces (1) ultimately result in relatively pure tests of im- 
portant factors and if reliable criteria can be obtained, industrial psycholo- 
gists may perhaps be able to build up reasonably valid predictive equa- 
tions based on multiple indicators. Then they may be in a position to 
make effective use of theoretical distributions and tables like Taylor- 
Russell's. 

Meanwhile it would seem that, for vocational selection, the realistic 
way to make the best use of the relationships shown in a predictive 
scattergram is to compare analytically the distributions of the individual 
arrays within the scattergram, so that optimal critical scores can be set. 
(Statistical analysis of variance will help distinguish between chance 
and real significance.) Then, if possible, enough applicants should be 
tested so that there will be a sufficient number with scores above (or 
between) the critical points to fill all vacancies. 


Received April 5, 1948. 
References 


1. Guilford, J. P. Some lessons from aviation psychology. American Psychologist, 
1948, 3, 3-11. 


2. Taylor, H. C., and Russell, J. T. The relationship of validity coefficients to the 
practical effectiveness of tests in selection: discussion and tables. J. appl. 
Psychol., 1939, 23, 565-578. 


3. Tiffin, Joseph. Industrial psychology (2nd Ed.). New York: Prentice-Hall, 1947. 


Spatial Relations Ability and Other Characteristics of 
Art Laboratory Students 


William B. Dreffin and C. Gilbert Wrenn 
University of Minnesota 


One characteristic reaction of the veteran population in college and 
university today is the demand for vocational preparation. As a result 
of this need it has been the growing conviction of the University of 
Minnesota General College that the curriculum should offer not only a 
broad base of general education but also an increasing number of vocation- 
ally oriented sequences designed for sub-professional work (3). A con- 
clusion of this survey of Minnesota's educational needs was that in many 
vocational areas there are five sub-professional jobs to each profes- 
sional one. 

This paper is concerned with the selection phase of a proposed voca- 
tional sequence in commercial art. It is an attempt to determine some 
of the criteria that could be used in the selection of students for such a 
training sequence. 

It is believed that some of these criteria may be sought in the spatial 
relations ability of art laboratory students, in their scholastic aptitude, 
and in their vocational interest profiles. Barrett (1) found that the 
Revised Minnesota Paper Form Board Test, Series BB, the Strong 
Vocational Interest Blank, women’s form, the Meier Art Judgment Test, 
and the Allport-Vernon Scale of Values all discriminated between art 
majors and non-art students at Hunter College. These same tests are 
suggested for the identification of art ability in the article on that subject 
in Kaplan’s Encyclopedia (2, pp. 59-63). Welch found that a test of 
creative thinking differentiated sharply between professional artists and 
college students (5). Maturity may have been a factor here. 

The following tests and inventories, regularly administered to all 
General College freshmen, were available for evaluation: ACE Psycholo- 
gical Examination, 1937 Form; Ohio State University Psychological Test, 
Form 22; and the Strong Vocational Interest Blank, both men’s and 
women’s forms. To determine spatial relations ability the Revised 
Minnesota Paper Form Board Test, Series MA, was administered to all 
art laboratory students. 

The sample studied consisted of 69 art students in the art laboratory 
course of General College (43 men, 26 women) who had a definite vocational 
objective in commercial art. Specific vocational goals included 
within this broad area were architecture, interior decorating, designing, 
illustrative advertising, and the teaching of art. Because the manual 
of the Paper Form Test does not give norms for commercial artists or 
architects the authors administered this test to a group of advanced 
architectural students and to a group of advanced commercial art 
students (Art Education) in the University of Minnesota. This was 
done upon the assumption that spatial relations ability is one factor in 
the aptitude for art and that these students would possess this ability to 
a relatively larger degree than corresponding individuals in other curricula 
or in the general population. 


Spatial Relations Ability 


Table 1 summarizes the basic statistics of the groups studied with 
regard to spatial relations ability. 


Table 1 

Means and Standard Deviations of the Specific Groups Studied and of the Published 
Norm Groups on the Revised Minnesota Paper Form Board Test, Series MA 

Group                                    N     Mean    S.D. 
This Study: 
  Art Laboratory Students               69      45      7.2 
  Commercial Art Students               29      46      4.3 
  Architectural Students                27      51      7.0 
Published Norms (original form): 
  General Population                   100      31     11.5 
  Liberal Arts Freshmen                247      38      8.5 
  Fifth Year Engineers                 238      46      8.0 


Table 2 summarizes the significance of the differences, in terms of 
variances and means, between the art laboratory students and liberal 
arts freshmen, commercial art students, and architectural students. 

As a group, art laboratory students are significantly superior to the 
average liberal arts freshman, are not significantly different from the 
average commercial art student, and are inferior to the average architec- 
tural student in spatial relations ability. 
The authors were not only interested in determining the spatial rela- 
tions ability of art laboratory students, but also in ascertaining whether 
there was a significant relationship of the spatial relations ability to 
achievement and to probable professional success. Each student in 
art laboratory was assigned two grades at the end of the quarter in which 
this study took place. One was a measure of his art laboratory achieve- 
ment and the other was a measure of the instructor’s considered judgment 
of that student's potential success in a commercial art vocation. Each 
student worked independently during the quarter in the area in which 
he felt himself most deficient and the instructor's course grade was in 
terms of individual improvement in this area. Because of this it might 
be expected that the correlation between spatial relations ability and 
the grades in art laboratory would be lower than the correlation of spatial 
relations with the instructor's judgment of the student's probable pro- 
fessional success. This was not the case, however, for the two coefficients 
were found to be .37 and .36 respectively, both significant at the five 
per cent level. 


Table 2 
Significance of the Difference in Terms of Variances and Means between the Art 
Laboratory Students and Liberal Arts Freshmen, Commercial Art 
Students, and Architectural Students, Respectively 

                                     "F" Test of Variances     "t" or "d" Test of Means 
                                     between Art Laboratory    between Art Laboratory 
Group                          N     Students and Groups       Students and Other Groups 
Liberal Arts Freshmen         247    F = 1.38   P > .05        t = 7.22   P < .01 
Commercial Art Students        29    F = 2.8    P < .01        d = .69    P > .05 
Architectural Students         27    F = 2.39   P < .01        d = 2.29   P < .01 

When P < .01 = significant difference at the one per cent level of confidence. 
When P > .05 = non-significant difference at the five per cent level of confidence. 
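
The kind of comparison reported in Table 2 can be roughly retraced from the summary statistics of Table 1. The sketch below is an assumption-laden illustration in Python: it uses a simple variance ratio and an unpooled critical ratio, which may not be the exact procedures the authors used, and because the published means and standard deviations are rounded the results only approximate the tabled values.

```python
import math


def variance_ratio(sd_a, sd_b):
    """F ratio of the larger variance to the smaller one."""
    larger, smaller = max(sd_a, sd_b), min(sd_a, sd_b)
    return (larger / smaller) ** 2


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between means divided by its (unpooled) standard error."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return abs(mean_a - mean_b) / standard_error


# Rounded summary statistics from Table 1.
art_lab = (45, 7.2, 69)          # mean, S.D., N
liberal_arts = (38, 8.5, 247)

f_value = variance_ratio(art_lab[1], liberal_arts[1])
cr_value = critical_ratio(*art_lab, *liberal_arts)
print(f"F = {f_value:.2f}, critical ratio = {cr_value:.2f}")
```

For the liberal arts comparison this gives an F near 1.4 and a critical ratio near 6.9, of the same order as the tabled 1.38 and 7.22.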


Scholastic Aptitude 


It was found that art laboratory students as a group are average in 
scholastic aptitude when compared to General College students as a whole, 
but, in terms of all university freshmen, they are considerably below 
average. Their mean score on the ACE Psychological Exam. and on 
the Ohio State Intelligence Test placed them at the 45th and 50th per- 
centiles respectively in terms of General College norms, and approxi- 
mately at the 4th and 16th percentiles respectively in terms of university 
freshmen. This places them at the average of the general population in 
intellectual ability as measured by these tests. 


Measured Interests 


When the interest profiles of the art laboratory students on the Strong 
Vocational Interest Blank were analyzed by sex it was found: 

1. That 12 per cent of the men scored B+ or A on the Artist scale 
and 14 per cent on the Architecture scale. 




2. That 66 per cent of the men scored B+ or A on one or more of the 
business contact scales. 

3. That 60 per cent of the men scoring B+ or A on one or more of the 
business contact scales scored B- or lower on one or both of the Artist 
and Architecture scales. 

4. That 46 per cent of the women scored B+ or A on the Artist scale 
of the women's form. 


Strong (4, pp. 685, 699, 716) reports that 48 per cent of his male artist 
norm group were commercial artists, and he found negative correlations 
between the interests of male artists and the interests of men scoring 
high in the business contact area (r's from -.27 to -.52). These results 
conflict with the interest scores of the male art laboratory students and 
there appear to be two possible interpretations: 


1. The art laboratory students are younger than the commercial 
artists of Strong's group and as they become older their interests may 
develop in the direction of the commercial artist pattern. This is not 
likely in view of Strong’s work on changes of interest with age. 

2. Art laboratory students, because of the low artist and high business 
contact interest scores, and because of relatively low scholastic aptitude, 
may enter commercial art but at a low professional level. Of course the 
two explanations are not mutually exclusive. They may become com- 
mercial artists but be most occupied with the business end of the vocation. 


Almost one-half of the art laboratory women of the General College 
sample secured A or B+ ratings in the Artist scale of the women’s form 
of Strong’s Inventory. These women had commercial art as an objective 
but one-half had the interest pattern of women artists in general. Only 
20 per cent of the norm group, according to Strong, were commercial 
artists. Their interest patterns are suggestive of art as a vocation (insofar 
as interests are concerned) but the number who will find commercial art 
their field is left in question. 


Commercial Art Curriculum 


There are certain implications for the curriculum to be noted here. 
Art laboratory students in the General College at the University of Min- 
nesota as a group have spatial relations ability equal to that of advanced 
commercial art students. Furthermore, when art laboratory students 
from the General College take advanced work in Art Education (com- 
mercial and teaching fields) they are reported by their instructors to 
achieve satisfactorily in art courses. As a group, however, art laboratory 
students have difficulty in competing with commercial art students in 
other academic fields. The reason for this is apparent in their compara- 
tively low scholastic aptitude test scores. The lack of measured ap- 
propriate interests among the men may also be a factor. 

It would appear feasible, therefore, to set up a commercial art sequence 
in the General College for students possessing spatial relations ability 
well above the average of liberal arts freshmen, with scholastic aptitude 
at about the median or above of General College students, and who have 
a definite vocational goal in commercial art. With regard to measured 
interests the findings of this study are less definitive. Two-thirds of the 
men art students have measured interests similar to the interests of men 
in business contact vocations; approximately one-half of the women art 
students, however, have measured interests similar to women artists. 
Both groups have reasonable prospects of entering the commercial art 
field at some undetermined level. 


Received April 9, 1948. 


References 


1. Barrett, D. M. Aptitude and interest patterns of art majors in a liberal arts college. 
J. appl. Psychol., 1945, 29, 483-492. 

2. Kaplan, O. J. (Ed.). Encyclopedia of vocational guidance. New York: Philosophical 
Library, 1948. 

3. Statewide Committee on Higher Education. Unfinished business—Minnesota's needs 
in higher education. Minneapolis: University of Minnesota Press, February, 
1947. 

4. Strong, E. K., Jr. Vocational interests of men and women. Stanford University 
Press, 1934. 

5. Welch, L. Recombination of ideas in creative thinking. J. appl. Psychol., 1946, 
30, 638-643. 


Notes on the Validity of the Grove Modification of the 
Kent-Shakow Industrial Formboard Series 


Ruth C. Wylie 
Connecticut College 


All the published data on the Modified Kent-Shakow Formboard 
Series have concerned reliability, discriminative capacity and norms 
rather than the validity of the test.1,2,3,4 The Scovill Manufacturing 
Company and the Cincinnati Employment Center reported that the 
original Kent-Shakow Series was useful in selection of toolmaker appren- 
tices and in discriminating among occupational groups.5,6 Unpublished 
data made available to the writer by Grove indicate that the Modified 
Formboard Series was found to discriminate sharply between a carefully 
selected group of graduate engineers and a group of college men matched 
for ACE scores but lacking in mechanical interests or experience. How- 
ever, no reports have been published on attempts to make predictions of 
vocational success with the Modified Formboard Series. Nor have cor- 
relations of this test with other more commonly used tests of mechanical 
ability been reported. 


Procedure and Results: Job Prediction Study 


Fifty-five men who were entering the Diesel Engine School of the 
United States Submarine Base at New London, Connecticut were tested 
with the Grove Modification of the Kent-Shakow Industrial Formboard 
Series and the Spatial Visualization Factor Test from Thurstone’s Tests 
of Primary Abilities. These men had been chosen for this special training 
partly on the basis of the Navy Mechanical Aptitude Test and General 
Classification Test. They were a highly selected group, as inspection 
of Table 1 will reveal. 

1 Grove, W. R. Modification of the Kent-Shakow Formboard Series. J. Psychol. 
2 Wylie, R. C. The reliability of the Grove Modification of the Kent-Shakow Form- 
board Series. J. appl. Psychol., 1947, 31, 155-159. 
3 Wylie, R. C., Wilson, A. W., and Grove, W. R. High school norms for the Grove 
Modification of the Kent-Shakow Formboard Series. J. appl. Psychol., 1948, 32. 
4 Wylie, R. C. The performance of girls and women on the Grove Modification of 
the Kent-Shakow Formboard Series. J. Psychol., 1948, 25, 99-103. 
5 Bingham, W. V. Aptitudes and aptitude testing. New York: Harper, 1937, p. 1881. 
6 Paterson, D. G., Schneidler, G. G., and Williamson, E. G. Student guidance 
techniques. New York: McGraw-Hill, 1938, pp. 229-233. 




Table 1 
Navy Mechanical Aptitude Test Scores and General Classification Test Scores for 
45* Men Entering Diesel Engine School at the United States 
Submarine Base, New London, Conn. 

Navy Mechanical Aptitude Test              General Classification Test 
Score** Range   Rating***        N         Score** Range   Rating***        N 
65-71           High            12         65-77           High            11 
55-64           Above average   27         55-64           Above average   25 
45-54           Average          4         45-54           Average          9 
35-44           Below average    1         35-44           Below average    0 
13-34           Very low         1         21-34           Very low         0 

* Of the 46 men who graduated, records on Navy Mechanical Aptitude Test and 
General Classification Test were available for only 45. 
** Navy Standard Scores (Mean = 50; S.D. = 10). 
*** Ratings assigned to Navy Standard Scores on Navy Norm Sheets. 


Forty-six of the fifty-five entering candidates graduated from the 14- 
week course. Nine of the original group were transferred to other special 
schools or duties and did not graduate from Diesel Engine School. As 
far as can be ascertained, their transfer was occasioned by reasons other 
than likelihood of unsatisfactory performance in Diesel Engine School. 
Therefore their records were eliminated from the study rather than being 
included as failures. 

A final grade for each of the 46 graduates was made available to the 
writer. This final grade was an average of partial grades given for 
theoretical and laboratory work in such subjects as: General Motors and 
Fairbanks Morse Engines, Hydraulics, Refrigeration and Air Condi- 
tioning Equipment. 

The correlation between Navy Mechanical Aptitude Test Scores and 
final grades was +.58 ± .07. The correlation between the General Clas- 
sification Test and final grades was +.18 ± .09. The Modified Form- 
board Series correlated with final grades +.47 ± .07. 

Thus it is seen that the Modified Formboard Series predicted Diesel 
Engine School grades with a degree of efficiency not statistically signifi- 
cantly different from the Navy Mechanical Aptitude Test. (Critical 
ratio for the obtained difference = .4.) The Modified Formboard Series 
predicts Diesel Engine School grades somewhat more efficiently than 
does the General Classification Test. The critical ratio for the obtained 
difference was 1.6. 




It seems reasonable to expect that the correlations for both the Navy 
Mechanical Aptitude Test and the Modified Formboard Series would be 
higher if a less highly selected group of subjects had been used. The 
range of talent for our group was restricted, not only on the Navy Me- 
chanical Aptitude Test, but also on the Modified Formboard Series. The 
median submariner's score on the Modified Formboard Series fell at the 
90th percentile of the high school standardization group; and only five 
of the submariners had formboard scores less than the 50th percentile 
of the high school standardization group. These five scores had the 
following percentile ranks based on the high school norms: 45, 45, 40, 
40, 10. 


Table 2 
Correlations Obtained between the Grove Modification of the Kent-Shakow Formboard 
Series and other More Commonly Used Tests which Apparently Involve 
Mechanical Ability and the Spatial Visualization Factor 

Test with which Formboard 
Series was Correlated             N     Group                                        r 
Navy Mechanical Aptitude Test     45    Submariners entering Diesel Engine School   .55 
Thurstone Space Factor Test       55    Submariners entering Diesel Engine School   .49 
Minnesota Paper Formboard,       215    Boys, grades 9-12 in formboard              .58 
  Series AA                             standardization group 
Wechsler Block Designs Test      242    Boys, grades 9-12 in formboard              .56 
                                        standardization group 
Minnesota Mechanical Assembly    200    Adult, male, white, penitentiary            .63 
  Test, Long Form 


Correlations of the Modified Formboard Series with Other Tests 

In a previous article on the Modified Formboard Series7 it was sug- 
gested that this test would probably turn out to be a measure of certain 
aspects of “mechanical ability,” particularly the spatial visualization 
factor which has been found to be important for success in mechanical 
occupations. In this connection it is interesting to inspect available 
correlations between the Modified Kent-Shakow Series and certain other 
more commonly used tests. 

7 Wylie, R. C. The reliability of the Grove Modification of the Kent-Shakow 
Formboard Series. J. appl. Psychol., 1947, 31, 155-159. 

Table 2 gives correlations obtained on various groups between the 
Modified Formboard Series and commonly used "mechanical ability" 
tests. 

Table 3 gives correlations between the Modified Formboard Series 
and certain commonly used tests which are not usually considered to be 
“mechanical ability" or “space factor" tests. 


Table 3 
Correlations Obtained between the Grove Modification of the Kent-Shakow Form- 
board Series and Some Commonly Used Tests of “General 
Intelligence” and “Scholastic Aptitude” 

Test with which Formboard 
Series was Correlated                  N     Group                                        r 
Navy General Classification Test       45    Submariners entering Diesel Engine School   .18 
Total Mental Factors, California      208    Boys, grades 9-12 in formboard              .33 
  Test of Mental Maturity,                   standardization group* 
  Short Form 
Verbal Score on Scholastic Aptitude    74    College Women**                             .33 
  Test given by College Entrance 
  Examination Board 
Math. Score on Scholastic Aptitude     74    College Women***                            .29 
  Test given by College Entrance 
  Examination Board 

* This group seemed to be comparable in range and distribution of scores to the 
group used in standardizing the California Test of Mental Maturity. 

** On the whole this group was fairly highly selected for verbal aptitude: percentile 
rank of the median score of this group = 72; however, the percentile ranks of the lowest 
and highest scores made by this group were 4 and 98. 

*** This group was not as highly selected for math aptitude as for verbal aptitude: 
the percentile rank of the median score of this group = 52; the percentile ranks of the 
lowest and highest scores made by this group = 7 and 98. 




It is not strictly justifiable to compare sizes of correlation coefficients 
obtained on groups differing greatly in range of talent on the Formboard. 
Nevertheless the trend toward higher correlations in Table 2 suggests 
that the Modified Formboard Series does have more in common with 
“mechanical ability” tests, especially tests which are apparently loaded 
with the space factor, than it does with so-called “scholastic aptitude” 
or “general intelligence” tests which do not contain, or are not so heavily 
weighted with, items requiring space visualization. 


Summary and Conclusions 


It has been shown that: (1) the Grove Modification of the Kent- 
Shakow Industrial Formboard Series predicted success in training on a 
mechanical job in the United States Navy with efficiency comparable to 
the Navy Mechanical Aptitude Test. The correlation between the 
Modified Formboard Series and final grades in Diesel Engine School 
for a highly selected group of candidates was .47; (2) for several different 
groups of subjects, the Modified Kent-Shakow Formboard Series cor- 
relates .5 to .6 with certain other "mechanical aptitude" tests which 
apparently involve the spatial visualization factor; (3) Pearson r’s of the 
order of .2 to .3 have been obtained between the Modified Kent-Shakow 
Formboard Series and “general ability” tests or tests apparently involving 
verbal and mathematical factors to a much greater degree than the 
spatial visualization factor. 


Received May 13, 1948. 


A Study of Two Techniques of Measuring 
“Mechanical Comprehension” * 


W. T. McElheny 
State University of Iowa 


The personnel manager, the vocational counselor, the psychologist, 
and others interested in the selection and placement of persons in work of 
a mechanical nature have long been concerned with the problem of deter- 
mining which of a multitude of mechanical tests should be used for pre- 
diction purposes. One of the primary issues has been whether to employ 
paper and pencil tests or performance tests. The former are the easier 
to give, but there has always been the suspicion that the latter might 
possibly be of more value in predicting success in mechanical occupations. 

This study was designed to provide some indication of the relation be- 
tween responses on performance or apparatus tests and paper and pencil 
devices which, it is assumed, were developed to predict essentially the 
same criterion measures. According to Bennett and Cruikshank (2), 
these are measures of “mechanical comprehension” —the “understanding 
of principles and relationships underlying mechanical operations” (p. 2). 

The Bennett Test of Mechanical Comprehension, Form AA (1) has 
been chosen as representative of the paper and pencil instruments. The 
Purdue Mechanical Assembly Test (3) represents the assembly type test. 
The assumption is made that these two devices are attempting to measure 
essentially the same phenomena but by two quite different testing media: 
paper and pencil, in the case of the former, and assembly, in the case of 
the latter. That this assumption is a reasonable one may be seen when 
both tests are examined in closer detail. In this study, then, the concern 
is with the relationship between performance on the two tests. Do both 
tests, in spite of their differences in form and method, yield much the 
same rank order of performance? 

The Bennett test (1) was devised to measure the capacity of an indi- 
vidual to understand various types of physical relationships. It contains 
sixty pictorially presented mechanical problems, and since no mathema- 
tical or arithmetical computations are required and the verbal or reading 
element is reduced, it is Bennett’s claim that the effect of training and 
formal knowledge is minimized (2, p. 39). 

* From unpublished М.А. thesis, Department of Psychology, State University of 
Iowa, August, 1948. 


The Purdue Mechanical Assembly Test as designed by Graney (3) 
consists of nine problem boxes of equal floor area. In each box a mecha- 
nism may be assembled in such a way that a mechanical action takes 
place. The first subtest serves as an introductory unit to acquaint the 
examinee with the nature of the task to be performed. Each of the sub- 
tests, when properly assembled, constitutes a mechanism which can be 
manually operated from the outside of the box by a crank or push-bar. 
The test was constructed along the broad outlines of the Stenquist Me- 
chanical Assembling Test and the Minnesota Mechanical Assembly Test 
but with the hope of eliminating certain defects present in them. The 
Purdue test has eliminated stereotyped mechanical contrivances in favor 
of new and novel mechanical problem situations, thus ruling out the 
effect of chance familiarity with the task at hand for some persons, 
regardless of age or experience. As Graney (3) has pointed out, however, 
the fact that each sub-test employs in its assembly standard mechanical 
items such as levers, links, cams, gears, pinions, etc., tends to make the 
solution simpler for experienced mechanical workers as opposed to me- 
chanically naive laymen. The principles of assembly in each test are, in 
most instances, drawn from standard design practices. 

Further, the Purdue test is built of parts which are relatively large 
and strong and which must be constructed by skilled craftsmen. While 
this tends to increase the test in bulk, weight, and cost, it likewise tends 
to keep the test in practically the same condition for all test subjects. 

Certain deviations from the Purdue Mechanical Assembly Test as 
designed by Graney (3) were made in the apparatus used in the present 
study. Minor modifications in the construction of two of the tests were 
made, and the subjects were provided with a small screw driver (and 
informed of its purpose) for use on these two tests. Graney's original 
design does not require the use of such a tool. The results presented here 
are based on scores made on seven boxes rather than the eight boxes 
which constitute the complete Graney test. In this study the assump- 
tion is made that these changes in apparatus do not constitute significant 
changes in the original Graney test. 


Subjects and Experimental Procedure 

Subjects. A statistical analysis has been made of scores achieved by 
100 college students at the State University of Iowa, to whom both the 
Bennett test and the Purdue assembly test were administered. This 
group consisted of eighty male and twenty female subjects, ranging in 
age from eighteen to forty-four, representing every school classification 
according to class, and twenty-two fields of specialization. The male 
group had a modal age of twenty-four, a modal school classification of 
college sophomore, and majors in the College of Commerce made up the 
highest frequency group. Of the female subjects, whose modal age was 
twenty-one and school classification was college senior, majors in psy- 
chology predominated. Without exception, the cooperation and effort 
of the subjects were considered to be extremely good. 

Experimental Procedure. The Purdue Mechanical Assembly Test 
was administered individually, by the writer, to each of the 100 subjects. 
Each of the sub-tests was timed and scored separately; timing was to the 
nearest half-second. The time required on each of the sub-tests was 
totalled separately for Forms A and B. These two sub-totals were then 
combined to yield a total time score for the entire test which was rounded 
to the nearest half-minute. Timing was begun when the subject removed 
the top from the box and continued until his assembly was completed. 
In the event the subject failed to complete the assembly within the 
maximum time limit, his score would be that maximum. In such a case, 
the examiner demonstrated the correct assembly of that box and then 
the subject passed immediately to the next. The maximum time limit 
for each of the boxes is given in Table 1. 


Table 1 
Maximum Time Limits for Problem Boxes in the Purdue 
Mechanical Assembly Test 

Box                  Time Limit (in minutes) 
A-1                            5 
A-2                           10 
A-3                           15 
  Total Form A                30 
B-1                            5 
B-2                           10 
B-3                           15 
B-4                           20 
  Total Form B                50 
  Total for Test              80 


The Bennett Mechanical Comprehension Test, Form AA, was admin- 
istered and scored according to prescribed instructions. No time limit 
was imposed. In every case, it was given after the subject had taken the 
assembly test at some previous time. The time between the administra- 
tion of the two tests varied from one day to two months. 

The correlations of the two tests have been computed for the total 
group and for men and women separately. Estimates of the reliability 
coefficient of each test are also provided. 




Results and Discussion 


Results. The correlations of the two tests, together with the means 
and the standard deviations of the scores, are presented in Table 2. 

It is immediately apparent that there is a significant difference in the 
mean scores made by men and women on both tests. The correlation 
coefficient of .70 between the two tests is based on scores achieved by 
both sexes combined and must, therefore, be considered as spuriously 
high. Scores made by the male subjects alone yield a coefficient of cor- 
relation of .63. In the present study, the Kuder-Richardson formula 
for computing test reliability (4) gave a coefficient of reliability of .86 for 
the Bennett test. When the scores made on the three tests in Series A of 
the Purdue assembly test were correlated with scores on the first three 
tests in Series B, the obtained coefficient was .57. The correlation be- 
tween scores made on Series A and those made on all four tests in Series 
B was .58. From these data, Spearman-Brown estimates of the relia- 
bility coefficients were .73 and .76, respectively. Scores achieved by the 
group of eighty male subjects only were used in estimating the above 
reliability coefficients. Thus, it is seen that the obtained correlation be- 
tween the two tests (for male subjects) is about .18 lower than the geo- 
metrical mean of their respective reliabilities. 
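
The arithmetic behind these reliability figures can be sketched as follows. This is only an illustration: the function names are made up for the example, and the two-halves form of the Spearman-Brown formula is assumed. Only the first estimate (.57 stepped up to about .73) is reproduced, because the article does not say how the second estimate (.76) was adjusted for the unequal number of boxes in the two series.

```python
import math


def spearman_brown(r: float, length_factor: float = 2.0) -> float:
    """Step a part-test correlation up to the reliability of a test
    `length_factor` times as long (2.0 = two comparable halves)."""
    return length_factor * r / (1.0 + (length_factor - 1.0) * r)


def correlation_ceiling(reliability_x: float, reliability_y: float) -> float:
    """Geometric mean of two reliabilities: the largest correlation the
    two tests could show if they measured exactly the same thing."""
    return math.sqrt(reliability_x * reliability_y)


# Figures quoted in the text (male subjects only).
purdue_reliability = spearman_brown(0.57)    # Series A vs. first three boxes of Series B
ceiling = correlation_ceiling(0.86, 0.76)    # Bennett .86, Purdue .76
shortfall = ceiling - 0.63                   # obtained correlation for males

print(f"Purdue reliability stepped up from .57: {purdue_reliability:.2f}")
print(f"Ceiling on r: {ceiling:.2f}; shortfall: {shortfall:.2f}")
```

The shortfall printed here corresponds to the "about .18" quoted in the text.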


Table 2
Correlations between the Purdue (P) and Bennett (B) Tests:
Means and Standard Deviations of Scores

Group               N     r     M_P     σ_P     M_B    σ_B
Males               80   .63   52.10   12.46   45.45   8.41
Females             20   .40   68.82    7.74   34.00   6.67
Males and Females  100   .70   55.44   13.45   43.16   9.30


The lower correlation (r=.40) between the tests for female subjects 
would seem to be a result of the greater homogeneity of this group. In 
spite of this lower correlation in the case of female subjects, the standard 
error of estimate of both tests is less for women than for men. The error 
of estimate on the Bennett test (from scores on the Purdue test) is 6.14 for 
female subjects and 6.56 for males; the Purdue test shows a standard 
error of estimate (from the Bennett scores) of 7.12 and 9.72, for women 
and men, respectively. 
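
A minimal sketch of how such standard errors of estimate follow from the figures in Table 2, assuming the usual formula (standard deviation of the predicted test times the square root of one minus r squared). The dictionary layout and names are illustrative only. The results agree with the published values to within about 0.05 of a point; the small differences presumably reflect rounding of the published correlations and standard deviations.

```python
import math


def standard_error_of_estimate(sd_predicted: float, r: float) -> float:
    """Standard error when one test is predicted from the other."""
    return sd_predicted * math.sqrt(1.0 - r * r)


# Correlations and standard deviations as given in Table 2.
GROUPS = {
    "males": {"r": 0.63, "sd_purdue": 12.46, "sd_bennett": 8.41},
    "females": {"r": 0.40, "sd_purdue": 7.74, "sd_bennett": 6.67},
}

estimates = {
    name: {
        "bennett": standard_error_of_estimate(g["sd_bennett"], g["r"]),
        "purdue": standard_error_of_estimate(g["sd_purdue"], g["r"]),
    }
    for name, g in GROUPS.items()
}

for name, errors in estimates.items():
    print(f"{name}: Bennett {errors['bennett']:.2f}, Purdue {errors['purdue']:.2f}")
```

The smaller standard deviations of the female group outweigh its lower correlation, which is why its errors of estimate come out smaller.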

Discussion. The correlation obtained in this study between scores 
made by male subjects on the Bennett test and the Purdue assembly test 
indicates that in spite of the difference in form of the stimuli presented 
to the subjects, both tests are eliciting responses which lead to much the 


Measuring “Mechanical Comprehension” 615 


same rank order. While this finding suggests that a common element is 
being measured by both tests, the difference between the obtained and 
the “maximum” correlation is sufficiently great so that the variation in 
the method should be considered as an additional factor. 

The obtained coefficients of reliability are in close agreement with 
those which have been reported previously. Other studies (1, 3) have 
yielded reliability coefficients of .84 for the Bennett test and .77 for the 
Purdue test. 

From the standpoint of employability and administration, it is obvious 
that the Bennett mechanical test has many advantages over the Purdue 
assembly test. Among these are: (1) a large number of subjects may 
easily be examined simultaneously, (2) instructions for taking the test are 
self-explanatory, (3) test materials are readily available, and (4) a shorter 
time is required for taking the test. The significance of this latter ad- 
vantage can hardly be overemphasized. Using the average time re- 
quired by both men and women on the tests (fifty-five minutes for the 
Purdue and twenty-five minutes for the Bennett), the ratio of the time 
required for one subject is approximately two to one. If it were desired 
to test fifty subjects, and if only one set of the assembly apparatus were 
available, the ratio would become 100 to one! 
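
The time comparison can be worked through as below. The sketch assumes that the Bennett test is given to all fifty subjects in a single group session and that the assembly test is given to one subject at a time; the function name is hypothetical. With the unrounded averages the ratio comes to 110 to one, which the text rounds to 100 to one by starting from the two-to-one figure.

```python
import math

PURDUE_MINUTES = 55    # average time per subject, assembly test
BENNETT_MINUTES = 25   # average time per subject, paper and pencil test


def examiner_minutes(subjects: int, minutes_per_session: int, seats: int) -> int:
    """Total testing time when `seats` subjects can be tested at once."""
    sessions = math.ceil(subjects / seats)
    return sessions * minutes_per_session


single_ratio = PURDUE_MINUTES / BENNETT_MINUTES                    # one subject
purdue_total = examiner_minutes(50, PURDUE_MINUTES, seats=1)       # one apparatus
bennett_total = examiner_minutes(50, BENNETT_MINUTES, seats=50)    # one group session
group_ratio = purdue_total / bennett_total

print(f"One subject: {single_ratio:.1f} to 1; fifty subjects: {group_ratio:.0f} to 1")
```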

Before one decides which of two tests should be employed in a battery, 
however, he must answer the critical question of the extent to which each 
of them correlates with some outside criterion. At the present time there 
are no studies known to the writer which provide an estimate of the 
correlation of these two tests with the same criterion for strictly com- 
parable groups. In the standardization of the Bennett test, it was found 
(1) that the test has a correlation of .50 with the average grades from 
technical military courses. Graney (3) reports a correlation of .51 be- 
tween his test and industrial merit ratings of ninety-one machinists. 
Correlation with the combined ratings of six instructors of forty-eight 
apprentice machinists was .34. If further studies should demonstrate 
that the tests show equivalent correlations with the same criteria for 
comparable groups, then one would be justified in using the more easily 
employed Bennett paper and pencil test alone. An exception to this 
might judiciously be made, however, if there is reason to believe that 
the first test has been invalidated in any way during its administration. 
In such a case, the assembly test might serve as an excellent further check 
on the aptitude of the subject under consideration. Since these con- 

¹ It hardly need be pointed out that all too frequently in vocational guidance,
selection and placement of employees, etc., it is necessary to assume that, since a test
has been shown to correlate significantly with performance on one type of job, it
will correlate with another which is considered, a priori, to be a related job. In other
words, there is an urgent need for the experimental determination of job families.




clusions are based on a college population, further investigation will be 
necessary to determine whether the same relationship holds for particular 
occupational groups. 


Summary 


It was the purpose of this investigation to study the relation between 
performance on two tests, both of which purport to require the ability to 
reason about mechanical principles, but which use different testing 
media—one being a paper and pencil test and the other a performance 
test. The Bennett Test of Mechanical Comprehension, Form AA, was 
selected as representative of this first type test. The Purdue Mechanical 
Assembly Test was chosen as the performance test. 

The Purdue test was designed and constructed to be similar in principle
to the Stenquist Mechanical Assembling Test and the Minnesota
Mechanical Assembly Test. It embodies certain characteristics, e.g. 
sturdy, precision construction and non-stereotyped problem situations, 
which appear to be improvements upon those which were its prototypes 
(3). Eight sub-test problems compose the test, and they are divided 
into two forms of four sub-tests each. The sub-test order of administra- 
tion is, in each form, from the simple to the complex. Only seven sub- 
tests have been used in the present study—three in Form A and four 
in Form B. 

Both the Bennett test and the Purdue test were administered to 100 
college students at the State University of Iowa. The group consisted 
of eighty men and twenty women, whose ages ranged from eighteen to 
forty-four, who represented every school classification and twenty-two 
fields of specialization. In every case the Purdue test was administered 
first, and instructions given to the examinees were uniform. 

Scores made by the eighty male subjects on the two mechanical tests 
were found to correlate .63 with one another. This correlation, while 
about .18 lower than the maximum correlation as estimated from the 
reliabilities of the two tests, indicates that in spite of the difference in 
form of the stimuli presented to the subjects, both tests are eliciting re- 
sponses which lead to much the same rank order, at least in this type of 
population. It is pointed out, however, that the difference between the 
obtained and the “maximum” correlation is sufficiently great so that the 
variation in the method should be considered as an additional factor. 


The findings of this study seem to justify the conclusion that if future
studies should demonstrate equivalent and comparable correlations between
each of the tests and some outside criterion, then the more usable
Bennett test may safely be employed instead of a performance test in 
guidance, selection, and placement. It is suggested that the assembly 
test might furnish a further check on a person's aptitude, if such is needed. 




It has been emphasized that while such a relationship may hold for a 
college population, further investigation is needed to determine if the 
same would be true for particular occupational groups. 


Received May 10, 1948. 


References 


1. Bennett, George K. Manual for test of mechanical comprehension, form AA. New 
York: The Psychological Corporation, 1940. 

2. Bennett, G. K., and Cruikshank, Ruth M. A summary of manual and mechanical
ability tests. New York: The Psychological Corporation, 1942. 

3. Graney, Maurice R. The construction and standardization of the Purdue mechanical 
assembly test. Ph.D. thesis, Purdue Univ., 1942. 

4. Kuder, G. F., and Richardson, M. W. The theory of the estimation of test
reliability. Psychometrika, 1937, 2, 151-160.


Norms for the Test of Mechanical Comprehension 


Clifford E. Jurgensen 
Minneapolis Gas Company 


Applied psychology textbooks and publishers of psychological tests
frequently advise users to establish their own test norms. The assumption
is made that norms established for a specific type of work within a
single company are more useful to that company than are more general 
norms. 

Companies which do establish their own norms are frequently con- 
fronted with the fact that such norms differ considerably from those 
published in test manuals. Although the company norms may be more 
useful to that company than test manual norms, it is nevertheless im- 
portant to know to what extent these differ from the published norms. 
Such information is useful in determining, for example, whether applicants 
of the company are superior, equal to, or inferior to applicants of other 
companies (published norms). 

In other cases, such as when tests are used for guidance purposes, it 
may be undesirable or impossible to develop usable local norms. In such 
cases it is important to know whether or not the norms published in the 
test manuals are consistent with those obtained by other workers. 

As is the case with test norms, data on intercorrelations of one test 
with others are also needed. 

Normative and intercorrelational data are published here on the 
Test of Mechanical Comprehension, Form BB.¹ Data were obtained
from applicants for mechanical work consisting of (1) installing or repairing
gas main or pipe, (2) installing, adjusting or repairing gas appliances
such as ranges, refrigerators, water heaters, house heaters, etc., (3)
repairing meters (soldering, sheet metal work, etc.) and (4) miscellaneous
mechanical occupations such as electrician, welder, machinist, stationary
fireman, auto mechanic, etc. Some applicants applied for a specific job
and others applied for work within the broad area of "mechanical" work.
No applicant who was hired has worked (or will work) in all of the above 
jobs, although versatility is considered desirable. In general the group 
corresponds rather closely to the typical "mechanical applicant group" 
and norms based on this group would be expected to agree with Bennett's 
[...] adequately established group does not systematically
differ from such a standardization group.

¹ By Bennett, G. K., and Fry, D. E. Published by the Psychological Corporation,
1941.

Test scores on 2000 cases included in this study gave a mean score
of 28.2 and a sigma of 10.9. These agree rather closely with Bennett
and Fry who reported² a mean of 29.1 and a sigma of 11.0.


Table 1 
Comparison of Two Distributions of Scores of Applicants for Mechanical Work 


Bennett Gas Co, and муљ Gas Co. 
and 
Percentiles (anon) "Rcge ie ng е, 
v ы 53 55 м 
95 48 47 47 46 
90 45 42 43 2 
85 42 40 ET] 40 
80 39 39 38 37 
75 36 35 36 35 
70 35 m 35 и 
65 33 32 33 32 
60 32 з з2 31 
55 30 30 31 30 
50 28 28 29 28 
45 2 2 28 7 
40 26 5 26 25 
35 21 24 25 24 
30 23 23 23 2 
25 21 20 22 21 
20 19 19 20 19 
15 17 17 18 17 
10 15 14 15 14 
5 13 10 п 10 
1 6 4 3 3 
Number of Cases 435 2000 435 2000
Score Mean 29.1 28.2 29.1 28.2
Score Sigma 11.0 10.9 11.0 10.9


Raw test scores corresponding to the selected percentiles used in the 
test manual were computed from the data on 2000 job applicants. 
Table 1 indicates the close agreement between the two distributions at 
all points. For the twenty-one selected percentile points, eight have
the same raw score, nine differ by only one raw score unit, two differ by
two units, and two differ by three units. Test manual scores and those
reported here were plotted on Otis’ Normal Percentile Chart³ and the
remaining percentile points were compared. Discrepancies were even
less than for the selected percentile points due to the fact that selected
percentile points were rounded to the closest whole number. These
comparisons were based on original data without any smoothing of the
curves. Smoothing of the distribution resulted in closer agreement.

² Bennett, G. K., and Fry, D. E. Manual of Directions, Test of Mechanical
Comprehension, Form BB. New York: The Psychological Corporation, 1941.

The best fitting normal curves based on the mean and sigma of the
two distributions were calculated. Very close agreement was found between
the two curves and both agreed very closely with the selected percentile
points of the two original distributions.
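
One way to picture the comparison of the two fitted normal curves is sketched below. It assumes that each curve is fully described by the mean and sigma quoted above and reads off the raw score at each of the twenty-one selected percentiles. The variable names are illustrative, and the scores are left unrounded rather than rounded as in the manual.

```python
from statistics import NormalDist

# The twenty-one selected percentile points: 99, 95, 90, ..., 10, 5, 1.
PERCENTILES = [99] + list(range(95, 0, -5)) + [1]

manual_curve = NormalDist(mu=29.1, sigma=11.0)    # Bennett and Fry, N = 435
gas_co_curve = NormalDist(mu=28.2, sigma=10.9)    # present study, N = 2000

rows = [
    (p, manual_curve.inv_cdf(p / 100), gas_co_curve.inv_cdf(p / 100))
    for p in PERCENTILES
]
largest_gap = max(abs(manual - gas) for _, manual, gas in rows)

for p, manual, gas in rows:
    print(f"{p:>3}th percentile: manual {manual:5.1f}   gas co. {gas:5.1f}")
print(f"Largest difference between the curves: {largest_gap:.2f} raw score points")
```

Under these assumptions the two curves never differ by much more than one raw score point, which is in line with the close agreement described in the text.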


Table 2
Correlation with Other Measures

                                        Correlation with Test
                                        of Mechanical
                                        Comprehension,
Measure                 Mean    Sigma   Form BB
Age                     27.6     8.0    -.02
Education               11.3     1.8    +.35
Vocabulary              28.7     5.1    +.30
Abstraction             12.8     4.3    +.46
Mental Ability          54.2    11.8    +.50
Conceptual Quotient     92.7    14.9    +.32


Bennett and Fry have reported correlations of the Test of Mechanical 
Comprehension with various mental ability tests (College Board Ex- 
amination, American Council Psychological Examination, and Modified 
Alpha Examination). These range from .10 to .41. Correlations of the
Test of Mechanical Comprehension with other measures were computed 
for 500 randomly selected cases from the applicant group discussed here. 
The measures used were age, education, and the various scores obtained 
on the Shipley-Hartford Institute of Living Scale. The mean and sigma 
of this group on the Test of Mechanical Comprehension were 29.2 and 11.1 
respectively, and so are almost the same as for the total applicant group 
reported here as well as that reported in the test manual. Results 
(given in Table 2) show that the Test of Mechanical Comprehension is 
relatively independent of these other measures.


Conclusions

Norms and intercorrelations reported here are in close agreement
with those reported by Bennett and Fry. Users of the test who cannot
develop norms for their specific situations can thus place more confidence
in the published norms than is frequently the case. The close agreement
between the norms given in the manual and those reported here does not
mean that users should not devise their own norms when conditions permit.
It does mean, however, that lack of such agreement (if found by
others) should not hastily be interpreted to mean that the test manual
norms are inaccurate. In case of such disagreement a profitable search
might be directed toward finding out why the obtained norms are not
comparable to the published norms.

³ Published by World Book Company, 1938.

⁴ Published by the Institute of Living (formerly called Hartford Retreat), Hartford,
Connecticut.

Received September 18, 1948. 

Early publication.


Reliability of Abbreviated Job Evaluation Scales * 


David J. Chesler 
Personnel Research Institute, Western Reserve University


The purpose of this investigation was to compare the abbreviated job 
evaluation scales that would be derived by application of the Wherry- 
Doolittle selection method (6) when all variables except the raters were 
held constant. 

In any comparative study of job evaluation systems there are at least 
three variables to be considered. These are the job evaluation manuals, 
the jobs to which the manuals are applied, and the job evaluators or 
raters. There have been few studies, if any, of abbreviated job evalu- 
ation scales in which any of these variables was held constant enough to 
permit direct comparisons of the results obtained. 

In the present study two variables were held constant throughout: 
(1) the job evaluation manual; and (2) the jobs. 

In a sense the present study may be considered a preliminary but
basic analysis of abbreviated job evaluation scales derived by the Wherry-
Doolittle selection method. This method was first applied to job evaluation
systems by C. H. Lawshe, Jr. (2), who may be considered the “discoverer”
of the abbreviated scale and also its chief advocate. At the
present writing Lawshe and various associates have published four studies
(2, 3, 4, 5) which utilized this technique. However, neither within any
of these studies, nor among them, were the variables held constant enough
for direct comparisons.

The present investigation has attempted to answer the question:
How “reliable” is the Wherry-Doolittle selection method as applied to
job evaluation systems? That is, given the same jobs and the same job
evaluation manual, will the same abbreviated scales be derived with
different raters?


Method 
Job raters in four industrial organizations rated independently de- 
scriptions and specifications for 35 salaried jobs on the same job evalua- 
tion manual. The jobs and the manual are the “standard jobs” and 
“standard manual” reported in a previous study (1), and the organiza- 
tions and raters are also the same as those reported previously. The 
standard manual was a typical point rating manual with 12 factors. 


* A condensation of a portion of a Ph.D. thesis submitted to the Graduate School
of Western Reserve University in 1948.




Results and Interpretation 


Abbreviated Scales Derived from the Standard Manual. The Wherry- 
Doolittle selection method was applied to the standard manual factor 
ratings submitted independently by the raters in companies A, B, and C, 
with total point rating as the statistical criterion. This is exactly the 
same procedure followed by Lawshe and various associates (2, 3, 4, 5). 
However, the difference between the present study and those conducted 
by Lawshe is that here there was rigid control of the jobs rated and the 
manual used, that is, all raters rated the same jobs on the same manual. 

The abbreviated scales identified with different raters evaluating 
the same jobs on the same manual are presented in Table 1. 


Table 1
Abbreviated Scales Derived from the Standard Manual by Raters in
Three Companies Who Rated the Same Jobs

      Co. A               Co. B               Co. C
Factor No.    R     Factor No.    R     Factor No.    R
     1      .892         1      .896         4      .902
     4      .954         4      .946         1      .961
     5      .965         8      .972         8      .972
     8      .976         5      .980         5      .987
     3      .983                             2      .990
    10      .989

Key to factor numbers: 1. work experience; 2. essential knowledge and training;
3. dexterity; 4. character of supervision received; 5. character of supervision given;
8. responsibility for confidential matters; and 10. responsibility for accuracy - effect
of errors.

The Wherry-Doolittle technique was applied first to the data submitted
by Co. A, and it was continued until six factors had been identified.
Carrying out the Wherry-Doolittle process to this length was an exploratory
measure to obtain an idea of the magnitude of the shrunken
R’s that might be expected. It was decided to stop the Wherry-Doolittle
process when the shrunken R attained a magnitude of .980. However, in
the case of Co. C the correlations between factor 5 and the other factors
had to be computed for another purpose, so that it was relatively simple
to identify an additional factor, namely factor 2.

It will be noted that the first four factors identified with each group
of raters were the same, although the order in which they were identified
was not the same. These four factors are “work experience,” “character
of supervision received,” “character of supervision given,” and
“responsibility for confidential matters.” Differences in the three instances




with respect to the order in which the factors were identified are appar- 
ently due to differences among the raters since the jobs and the job evalua- 
tion manual were constant throughout. 

If three factors are decided upon to comprise the abbreviated scale, 
then the same three factors have been identified in two out of three in- 
stances. This point is mentioned because in various studies (2, 3, 4, 5) 
Lawshe and his associates identified three factors to comprise the ab- 
breviated scale. 

Adequacy of Abbreviated Scales Derived from Standard Manual. A 
test of the adequacy of an abbreviated job evaluation scale is the degree 
to which jobs are displaced from the labor grades in which they were 
placed by the original scale. In the present study a comparison was made 
of the accuracy with which three abbreviated scales would predict the 
original values assigned to the jobs. Each abbreviated scale consisted 
of the same four factors and each was derived from the same original 
manual which was applied to the same jobs, but by different raters. In 
order to make this comparison, the three separate multiple regression 
equations for predicting total points from point ratings on “work ex- 
perience,” “character of supervision received,” “character of supervision 
given,” and “responsibility for confidential matters” were computed.
These three prediction equations were as follows: 


Co. A: TP_{SM} = 1.2F_1 + 2.3F_4 + 1.4F_5 + 1.6F_8 + 66.0
Co. B: TP_{SM} = 1.4F_1 + 1.7F_4 + 2.1F_8 + 1.2F_5 + 57.2
Co. C: TP_{SM} = 2.6F_4 + 1.3F_1 + 1.5F_8 + 1.6F_5 + 53.3


“TP” and “SM” indicate “total points” and “standard manual,” 
respectively. The standard errors of estimate for these three equations
were 13.6, 12.1, and 10.9 respectively. The multiple R’s for these three
equations were .98, .98, and .99 respectively. All of these R’s are signifi- 
cant at the one per cent level. These prediction equations were applied 
to the ratings given on the four factors by the raters in Co. A, Co. B, 
and Co. C, respectively, and three corresponding sets of predicted scores 
were obtained. 
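
The way such prediction equations are applied can be sketched as follows. Two things here are assumptions rather than facts from the study: the factor subscripts are taken to follow the order in which the factors were identified for each company (Table 1), and the job ratings in `example_job` are invented purely for illustration.

```python
# Weights keyed by factor number: 1 = work experience, 4 = supervision received,
# 5 = supervision given, 8 = responsibility for confidential matters.
# The assignment of weights to factors is an assumption (see the lead-in).
EQUATIONS = {
    "Co. A": ({1: 1.2, 4: 2.3, 5: 1.4, 8: 1.6}, 66.0),
    "Co. B": ({1: 1.4, 4: 1.7, 8: 2.1, 5: 1.2}, 57.2),
    "Co. C": ({4: 2.6, 1: 1.3, 8: 1.5, 5: 1.6}, 53.3),
}


def predict_total_points(company: str, ratings: dict) -> float:
    """Predict the standard-manual total from the four factor ratings."""
    weights, constant = EQUATIONS[company]
    return constant + sum(w * ratings[factor] for factor, w in weights.items())


# Hypothetical point ratings for one job on the four factors.
example_job = {1: 20, 4: 15, 5: 10, 8: 5}

predictions = {c: predict_total_points(c, example_job) for c in EQUATIONS}
for company, points in predictions.items():
    print(f"{company}: predicted total points = {points:.1f}")
```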

A uniform labor grade of 25 points was adopted as the classification 
plan for the standard manual so that the comparative adequacies of the 
three abbreviated scales could be easily studied. The range of points 
for the classification plan of the standard manual was 400 points (a 
minimum of 100 points for any job, and a maximum of 500 points), so that 
there was a total of 16 labor grades. The results are presented in Table 2, 
which shows the per cent of jobs in each instance which remained in the
same labor grade or which were displaced into another labor grade. It
will be noted that all the jobs remained in the same labor grade as the




Table 2
Labor Grade Displacement for 35 Standard Jobs with Abbreviated
Scales Derived from Standard Manual

Labor Grade        Co. A          Co. B          Co. C
Displacement      f     %        f     %        f     %
    +1           12   34.3       7   20.0       5   14.3
     0           17   48.6      18   51.4      26   74.3
    -1            6   17.1      10   28.6       4   11.4
  Totals         35  100.0      35  100.0      35  100.0


original classification or were displaced into a labor grade adjacent to that 
of the original classification. 
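
The classification plan described above (25-point labor grades running from 100 to 500 points) can be sketched as below. The handling of scores that fall exactly on a grade boundary, the clamping of out-of-range predictions, and the sign convention for displacement (predicted grade minus original grade) are assumptions made for the example; the article does not state them.

```python
GRADE_WIDTH = 25
MIN_POINTS = 100
MAX_POINTS = 500
NUM_GRADES = (MAX_POINTS - MIN_POINTS) // GRADE_WIDTH   # 16 labor grades


def labor_grade(points: float) -> int:
    """Map total points to a labor grade numbered 1 through 16."""
    clamped = min(max(points, MIN_POINTS), MAX_POINTS)
    grade = int((clamped - MIN_POINTS) // GRADE_WIDTH) + 1
    return min(grade, NUM_GRADES)   # 500 points belongs to the top grade


def displacement(original_points: float, predicted_points: float) -> int:
    """Labor grades by which the abbreviated scale moves a job (assumed sign)."""
    return labor_grade(predicted_points) - labor_grade(original_points)


print(NUM_GRADES)               # number of grades in the plan
print(displacement(140, 151))   # crosses one grade boundary
print(displacement(140, 149))   # stays in the same grade
```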


Another analysis of the same data is presented in Table 3, which shows
how ratings with the abbreviated scale deviated as much as 12.5 points
(0.5 labor grade), 25 points (1.0 labor grade), and more than 25 points,
from total points on the original scale. In three instances 62.8 per cent, 
68.5 per cent, and 74.2 per cent of the predicted ratings deviated from 
the original ratings by 12.5 points or less. In two instances 94.2 per cent, 
and in one instance 97.1 per cent of the predicted ratings deviated from 
the original ratings by 25 points or less. It should be noted that the 
standard errors of estimate (13.6, 12.1, and 10.9) for the three prediction 
equations are approximately equal to 12.5 points or 0.5 labor grade, indi- 
cating that about 68.2 per cent of the predicted scores would be within 
approximately the value of 0.5 labor grade of the original scores. 
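
The link between the standard errors of estimate and the share of jobs falling within half a labor grade can be checked with a short sketch. It assumes that the errors of prediction are normally distributed with a mean of zero; the helper name is illustrative.

```python
from statistics import NormalDist

HALF_GRADE = 12.5   # points, half of a 25-point labor grade


def share_within(limit: float, standard_error: float) -> float:
    """Proportion of normally distributed errors lying within +/- limit."""
    z = limit / standard_error
    unit_normal = NormalDist()
    return unit_normal.cdf(z) - unit_normal.cdf(-z)


within_one_se = share_within(1.0, 1.0)
expected = {se: share_within(HALF_GRADE, se) for se in (13.6, 12.1, 10.9)}

print(f"Within one standard error: {100 * within_one_se:.1f} per cent")
for se, share in expected.items():
    print(f"SE {se}: {100 * share:.1f} per cent within {HALF_GRADE} points")
```

Under this assumption the expected shares come out near 64, 70, and 75 per cent, close to the observed 62.8, 68.5, and 74.2 per cent.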

Comparison of Multiple Regression Equations for Abbreviated Scales
Derived from Standard Manual. The problem to be analyzed here is the
similarity of the three multiple regression equations computed for use 
with the abbreviated scales derived from the standard manual. Put in 


Table 3
Point Deviation for 35 Standard Jobs with Abbreviated Scales
Derived from Standard Manual

                        Co. A          Co. B          Co. C
Point Deviation        f     %        f     %        f     %
25.01 plus             1    2.9       1    2.9       -     -
12.51 to 25.00         5   14.3       3    8.6       3    8.6
0 to 12.50            12   34.2      15   42.8      17   48.5
0 to -12.50           10   28.6       9   25.7       9   25.7
-12.51 to -25.00       6   17.1       6   17.1       5   14.3
-25.01 minus           1    2.9       1    2.9       1    2.9
Totals                35  100.0      35  100.0      35  100.0




practical terms, the problem may be phrased thus: How much difference 
is there among the three multiple regression equations, derived inde- 
pendently in three instances and designed to predict three independent 
sets of standard manual total scores, in predicting a fourth set of standard 
manual total scores? 

The standard manual ratings obtained in Co. D were used as the
fourth set of ratings to which the three multiple regression equations ob- 
tained in companies A, B, and C were applied. That is, each of the three 
multiple regression equations was applied to the ratings assigned by the 
rater in Co. D to “work experience,” “character of supervision received,” 
“character of supervision given,” and “responsibility for confidential 
matters.” Comparisons were then made of the predicted scores obtained 
by application of each of the three multiple regression equations and the 
total scores assigned by the rater in Co. D. 


Table 4
Comparison in Terms of Labor Grade Displacement of Total Points Assigned by Rater
in Company D to 35 Standard Jobs and Predicted Total Points Computed
from Prediction Formulae of Raters in Companies A, B, and C

Labor Grade        Co. A          Co. B          Co. C
Displacement      f     %        f     %        f     %
    +2            1    2.9       -     -        1    2.9
    +1           20   57.1       9   25.7      14   40.0
     0           12   34.3      16   45.7      15   42.8
    -1            2    5.7       9   25.7       5   14.3
    -2            -     -        1    2.9       -     -
  Totals         35  100.0      35  100.0      35  100.0


Table 4 shows the per cent of jobs in each instance which remained in
the same labor grade as total points assigned by the Co. D rater placed 
the jobs, and also the per cent of jobs which were displaced one or two 
labor grades when compared with the labor grade classification deter- 
mined by the rater in Co. D. It will be noted that in all three instances 
97.1 per cent of the jobs either remained in the same labor grade as the 
rater in Co. D classified them, or they were displaced into an adjacent 
labor grade. So far, this finding indicates that for practical purposes the 
three prediction formulae are very much alike.

Table 5 shows an analysis of the same data in terms of point deviation. 
In all three instances 91.4 per cent of the jobs deviated 25 points or less 
from the values assigned to the jobs by the rater in Co. D. This confirms
the conclusion that for practical purposes the three prediction
formulae are very much alike.




Table 5
Comparison in Terms of Point Deviation of Total Points Assigned by Rater in
Company D to 35 Standard Jobs and Predicted Total Points Computed
from Prediction Formulae of Raters in Companies A, B, and C

                        Co. A          Co. B          Co. C
Point Deviation        f     %        f     %        f     %
25.01 plus             2    5.7       1    2.9       2    5.7
12.51 to 25.00        14   40.0       7   20.0       8   22.9
0 to 12.50             9   25.7      12   34.2      13   37.1
0 to -12.50            7   20.0       8   22.9       9   25.7
-12.51 to -25.00       2    5.7       5   14.3       2    5.7
-25.01 minus           1    2.9       2    5.7       1    2.9
Totals                35  100.0      35  100.0      35  100.0
Summary and Conclusions


1. The basic methodological feature of the present study was to have
raters in various companies evaluate a standard set of job descriptions
and specifications for 35 representative salaried jobs on a standard
manual. The standard manual was of the point rating type and contained
12 factors.

2. The Wherry-Doolittle selection method was applied to the standard
manual factor ratings submitted by analysts in three companies. The
first four factors identified in each company were the same, although the
order of identification was not the same. These four factors were “work
experience,” “character of supervision received,” “character of supervision
given,” and “responsibility for confidential matters.” The first
three factors identified were the same in two of the three companies.
Differences in the three companies with respect to the order in which the
factors were identified are apparently due to differences among the raters
since the jobs rated and the job evaluation manual used were constant
for all raters.

3. Application of the three abbreviated scales, each containing the
same four factors, resulted in all of the jobs remaining in the same labor
grade as the original classification, or in being displaced into a labor
grade adjacent to that of the original classification. In the three companies
62.8 per cent, 68.5 per cent, and 74.2 per cent of the predicted
ratings deviated from the original ratings by the point value of 0.5 labor
grade or less; similarly 94.2 per cent, 94.2 per cent, and 97.1 per cent of
the predicted ratings deviated from the original ratings by the point value
of 1.0 labor grade or less.




4. Application of the three abbreviated scales to a fourth independent 
set of ratings resulted in three sets of predicted scores which were, as a 
whole, very much alike, as measured in terms of labor grade displace- 
ment and point deviation. 

5. The present investigation substantiates the findings of Lawshe and 
various associates (2, 3, 4, 5) that abbreviated job evaluation scales 
justify themselves from the standpoint of technical and scientific accuracy 
and economy. In this connection, however, it may be pointed out they 
may not justify themselves psychologically, since they are liable to create
a belief among employees that all aspects of each job have not been fully
considered. 

Received July 12, 1948.
Early publication. 


References 


1. Chesler, D. J. Reliability and comparability of different job evaluation systems. 
J. appl. Psychol., 1948, 32, 465-475. 

2. Lawshe, C. H., Jr. Studies in job evaluation: II. The adequacy of abbreviated
point ratings for hourly-paid jobs in three industrial plants. J. appl. Psychol., 
1945, 29, 177-184. 

3. ——, and Alessi, S. L. Studies in job evaluation: IV. Analysis of another point 
rating scale for hourly-paid jobs and the adequacy of an abbreviated scale. 
J. appl. Psychol., 1946, 30, 310-319. 


4. ——, and Maleski, A. A. Studies in job evaluation: 3. An analysis of point ratings 
for salary paid jobs in an industrial plant. J. appl. Psychol., 1946, 30, 117-128. 

5. ——, and Wilson, R. F. Studies in job evaluation: 5. An analysis of the factor
comparison system as it functions in a paper mill. J. appl. Psychol., 1946, 30,
426-434.


6. Stead, W. H., Shartle, C. L., and Associates. Occupational counseling techniques. 
New York: American Book Co., 1940. 


A Note on Machine Scoring the Kuder Preference Record


Louis Lauro 
The City College of New York 


The use of the Kuder Preference Record has greatly increased in recent
years, particularly since it has become available in machine scoring
form. The time required on the original hand-scoring form was 15 to 20 
minutes per paper. The IBM machine-scored answer sheet for the 
Kuder must be inserted eighteen times in the scoring machine. On a 
trial run of thirty answer sheets the scoring time was found to be 62 
minutes. This is indeed a saving in time over the hand scoring method, 
but the time can be reduced still further. 

Scores for nine different occupational areas are obtained on the Kuder.
The score for each area is based on certain responses. Most of the
responses are counted towards the score for more than one area. However,
no response which is counted towards the score for Area 2 (Computational)
is counted towards the score for Area 3 (Scientific). Similarly, no
response which is counted towards the score for Area 6 (Literary) is
counted towards the score for Area 7 (Musical). Therefore, in machine
scoring, if a set of keys (“Elimination” and “Rights”) is so punched that
the score for Area 2 appears on the “Rights” circuit and the score for
Area 3 on the “Wrongs” circuit, both these scores can be obtained with
one insertion of an answer sheet. In a similar manner, if a set of keys
is so punched that the score for Area 6 appears on the “Rights” circuit,
and the score for Area 7 on the “Wrongs” circuit, both of these scores
can be obtained with one insertion. All positions except those responses
that are counted towards the score for Area 3 (or 7) are punched on one
key (“Elimination” key), and this key is used in conjunction with a key
(“Rights” key) on which all the responses that are counted towards the
score for Area 2 (or 6) are punched. Both “R” and “W” field selection
holes should be punched on the elimination key for all ten fields.

However, in practice, two scores are obtained for each interest area
simply because the number of items on the Kuder necessitates the use of
both sides of the answer sheet. So, scores for Area 2 and for Area 3 can
be obtained for each side of the answer sheet by one insertion. Similarly,
scores for Area 6 and for Area 7 can be obtained for each side of the
answer sheet by one insertion. The total number of insertions per answer
sheet (as well as the total number of key changes) is thus reduced from
eighteen to fourteen. The reduction in scoring time was found to be
seven minutes for thirty answer sheets, or a saving of 11 per cent.
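
The pairing principle and the resulting counts can be checked with a short sketch. It is only an illustration: the function names and the toy scoring keys are hypothetical (the actual Kuder keys are not reproduced in this note), and each key is assumed to be representable as a set of response positions.

```python
from itertools import combinations


def disjoint_pairs(keys):
    """Return the pairs of areas whose scoring keys share no response."""
    return [
        (a, b)
        for a, b in combinations(sorted(keys), 2)
        if keys[a].isdisjoint(keys[b])
    ]


def insertions_per_sheet(n_areas, n_sides, n_paired):
    """Insertions needed when n_paired pairs of areas share one pass per side."""
    return (n_areas - n_paired) * n_sides


# Hypothetical toy keys: response positions counted towards each area.
toy_keys = {2: {1, 4, 7}, 3: {2, 5, 8}, 6: {1, 5, 9}, 7: {3, 4, 8}}

pairs = disjoint_pairs(toy_keys)
before = insertions_per_sheet(n_areas=9, n_sides=2, n_paired=0)
after = insertions_per_sheet(n_areas=9, n_sides=2, n_paired=2)
saving_pct = 100 * 7 / 62  # seven minutes saved out of the 62-minute trial run

print(pairs, before, after, round(saving_pct))
```

With nine areas and two sides, the two pairings (Areas 2 and 3, Areas 6 and 7) take the count from eighteen to fourteen, and seven minutes out of 62 is the 11 per cent reported.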

The application of this principle to shorten the scoring time of the 
Strong Vocational Interest Blank is not possible since there are no two 
occupational-group scales which do not have responses in common. 


Received April 9, 1948. 


The Use of Rating Scales and Personal Inventories 
to Check Each Other 


James D. Weinland 
New York University 


Rating scales and personality inventories have been called “the most
used and worst form of human measurement.” In regard to the first
characteristic, that of being widely used, there is little question.
Merit ratings and interview ratings are familiar throughout business and
industry. Personal inventories are almost as common as crossword
puzzles. In regard to the second characteristic, that of being the worst
forms of measurement, there might be some difference of opinion. They
are both, however, subjective instruments, and the difficulty of
completely validating either one of them has been insurmountable.


Purpose of Study 


The purpose of this study is to demonstrate a method of improving 
the validation and use of rating scales and personal inventories by 
using them on the same individuals to check each other. 

In the early days of rating scales Hollingworth tried self ratings and
found them to be completely unreliable. It appears that people cannot
evaluate themselves very well on a few points only, giving relative
judgments on each, as a rating scale demands. But experience indicates
people can do what amounts to the same thing if the form is changed.
They can, with some accuracy, answer many personal questions yes or
no. In brief, they can fill out a questionnaire or personal inventory.

It becomes possible then to make out a rating scale for the use of
others, and an inventory on the same attributes to be filled out by the
subject himself. In this case the same qualities are being measured by
different instruments, and the validity of these instruments can be
checked, to some extent, by comparing the measurements obtained. This
was done.


Procedure 


A graphic rating scale and a personal inventory, with the same
divisions or subheadings, were constructed to measure personal
efficiency. The inventory was administered to 57 subjects. These
subjects were rated by three people each on the same attributes that had
been measured by the inventory. The method of obtaining the ratings is
one that might, under some circumstances, be useful elsewhere. Each
subject was requested to name one of his acquaintances who knew a number
of his other friends. The subject then wrote his name on each of the
three rating scales, placed them in stamped envelopes addressed to the
author of this article, and handed them to the friend named with the
request that he distribute the rating scales to three people who could
and would rate him, but who would remain unknown to the subject. It was
stressed that the subject of the ratings would not try to find out who
was rating him, and that the raters might therefore be assured of
remaining unidentified. With the rating scales was a note explaining the
situation and suggesting that fairness rather than leniency or flattery
would do their acquaintance the most good and be the act he would
consider the most friendly.

When the rating scales were returned, the descriptions were replaced
with numerals, from the highest to the lowest on each line, with the
following values: 100, 75, 50, 25, 0. If the rater had checked in
between descriptions, the average values of 87.5, 62.5, 37.5, 12.5 were
used. These values, it will be observed, harmonize very well with the
percentiles obtained for the inventory.
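
The conversion just described, together with the averaging over three raters that the procedure calls for, can be sketched as follows. The position coding (0 for the highest description, 4 for the lowest, half steps for checks between descriptions) and the function names are assumptions made for illustration; the article gives only the resulting values.

```python
from statistics import mean


def check_to_value(position):
    """Convert a check position on one line of the scale to its numeric value.

    position: 0, 1, 2, 3 or 4 for a check on a description (0 = highest),
    or a half step such as 0.5 for a check between two descriptions.
    This coding is an illustrative assumption, not the author's notation.
    """
    if not 0 <= position <= 4 or (position * 2) % 1 != 0:
        raise ValueError("position must be one of 0, 0.5, 1, ..., 4")
    return 100.0 - 25.0 * position


def average_rating(positions):
    """Average the converted values from the raters for one attribute."""
    return mean(check_to_value(p) for p in positions)


# Hypothetical example: three raters check the first, the second, and
# midway between the first and second descriptions.
print(average_rating([0, 1, 0.5]))
```

Coding the checks by position keeps the in-between values (87.5, 62.5, 37.5, 12.5) as simple midpoints of the neighbouring descriptions, which is how the article defines them.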

Each subject had been rated by three individuals, so these ratings for
each attribute were averaged, and this average charted on the
psychograph. The inventory was handled by establishing norms and
obtaining percentiles. The attributes studied carried the definitions
and explanations given below. Realism: the ability to see facts and act
accordingly. Economy: thrift, making the most of what one has (time,
energy, things). Appreciations: interests, tastes, and hobbies. Sense of
Justice: open-mindedness, fair play, weighs both sides of a question.
Motivation: drive, degree of activity and amount of energy expended.
Investment in Self: learning in and out of school. Personal Integration:
goals and plans, degree of personal organization. Social Integration:
appearance, manner and social skills.

The psychograph presented below (Fig. 1) is that of one person, Subject A.
The self estimates were obtained from the personal inventory scores, and
the ratings by the use of the graphic rating scale described above.

At first it might seem misleading to put on one psychograph two sets
of data arrived at by different techniques. It is probably true that were
there but one score from the rating scale and one from the inventory, a
direct comparison of them would not be legitimate. But from each
there are a number of scores that make a pattern, and a comparison of
the patterns may be very useful.


Fig. 1. Comparison of personal inventory scores with rating scale scores,
Subject A. (Psychograph; attributes along the horizontal axis: Realism,
Economy, Appreciations, Sense of Justice, Motivation, Investment in Self,
Personal Integration, Social Integration.)


An immediate value in comparing directly the two forms of measurement
is that it is often doubly reassuring to the subjects. Many people
do not like to be rated by others only, feeling that they are not
thoroughly understood or sympathetically judged. Some people like to
know what others think of them. The majority of our subjects displayed
as much interest in the comparison of the two types of scores as they
did in the basic experience of being measured and receiving indications
of their relatively high or low standing.

In plotting psychographs for the fifty-seven subjects some interesting
contrasts appeared. There are individuals who (a) consistently make
higher scores on the inventory than those given them on the rating
scales; (b) are consistently rated higher by others than the scores
they make for themselves on the inventory; and (c) have some rating
scores that are higher and some inventory scores that are higher.
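
The three contrasts can be expressed as a simple rule applied to each subject's paired scores. This is a sketch only: the function name, the labels, and the handling of ties (ignored here) are assumptions, since the article does not say how equal scores were treated.

```python
def classify_profile(inventory, ratings):
    """Label a subject by comparing inventory and rating scores attribute by attribute.

    Both arguments are equal-length sequences with one score per attribute.
    Tied attributes are ignored; that tie rule is an illustrative assumption.
    """
    if len(inventory) != len(ratings):
        raise ValueError("score lists must have the same length")
    diffs = [i - r for i, r in zip(inventory, ratings) if i != r]
    if diffs and all(d > 0 for d in diffs):
        return "a: inventory consistently higher"
    if diffs and all(d < 0 for d in diffs):
        return "b: ratings consistently higher"
    return "c: mixed"


# Hypothetical scores on three attributes for one subject.
print(classify_profile([80, 70, 90], [60, 65, 75]))
```

A stricter rule (for example, requiring a minimum gap before a difference counts) could be substituted without changing the three-way grouping.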


Correlations 


Reliabilities for the eight sections of the personality inventory ob- 
tained by the split-half method are given below in the first column of 
Table 1. Correlations between results obtained by inventory and rating 
scale methods are shown in column two of the same table. 
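
For readers who wish to reproduce figures of this kind, a minimal sketch of a split-half reliability and of the product-moment correlation follows. The article does not state how the halves were formed or whether the Spearman-Brown step-up was applied, so the odd-even split and the optional correction here are assumptions, and the item data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 scores")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)  # undefined if either list has no variance


def split_half_reliability(item_scores, step_up=True):
    """Correlate totals on alternate items; optionally apply Spearman-Brown.

    item_scores: one list of item scores (e.g. 0/1) per subject.
    """
    first_half = [sum(row[0::2]) for row in item_scores]
    second_half = [sum(row[1::2]) for row in item_scores]
    r_half = pearson_r(first_half, second_half)
    return 2 * r_half / (1 + r_half) if step_up else r_half


# Hypothetical yes/no answers of four subjects to a four-item section.
toy_items = [[1, 1, 1, 1], [1, 0, 1, 0], [0, 0, 0, 0], [1, 1, 0, 0]]
print(split_half_reliability(toy_items))
```

The same `pearson_r` function would serve for the second column of the table, applied to each subject's inventory percentile and averaged rating on one attribute.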

Examination of the various correlations suggests some of the values
that will accrue from making the comparison of ratings and inventories.
The highest intercorrelations are in social integration and realism. Both
of these characteristics are of the type that seem to permit observation
by others. Low correlations were found in self-investment, sense of
justice, and personal integration. These characteristics are not so
obvious, nor so subject to observation by others.


Table 1

Reliability of the Personal Inventory and Correlation between
Inventory and Ratings

                         Inventory        Rating-Inventory
Traits                   Reliabilities    Correlation
Realism                  .77              .70
Economy                  .76              .37
Appreciations            .81              .33
Sense of Justice         .79              .21
Motivation               .85              .40
Self Investment          .31              .15
Personal Integration     .82              .29
Social Integration       .66              .59

Some of our partial-inventory reliabilities can be raised, and the
inventory is being item analysed to improve it. But even as it stands the
scores from the inventory throw a contrasting light on the ratings,
particularly those of deep or hidden characteristics. Where correlations
are high, considerable confidence in the results may be assumed.
Where correlations are low, further work on the measuring instruments,
both inventory and rating scales, is suggested. In the meantime, the
subject measured is protected in that he is not given solely either
extreme score.


Summary 


1. The use of ratings and personal inventories can be improved by 
constructing them in parallel to measure the same individuals, in the 
same characteristics, by different methods. The two instruments check 
and to some extent may be said to correct each other. 

2. Individuals so far tested responded favorably to the double 
measurement. They were happy to have the chance to protect them- 
selves, and equally glad to see how the opinions of others compared with 
their own. 

3. Low correlations between results of the two methods indicate
further work is needed on the measuring instruments, and caution is
suggested in the use of the data. The higher the correlations between
the results obtained by the two instruments, the greater the confidence
that may be felt in the result.


4. A number of interesting clinical possibilities are indicated by the
method, particularly that of examining personalities whose inventory
scores are either consistently above or below their rating scores.


Received March 12, 1948.


References 


1. Drake, M. J., Roslow, S., and Bennett, G. K. The relationship of self rating and
classmate rating on personality traits. J. exp. Educ., 1939, 7, 210-213.
2. Driver, R. S. The validity and reliability of rating. Personnel, 1941, 17, 185-191.
3. Flory, C. D. Personality rating of prospective teachers. Educ. Adm. Supervis.,
1930, 16, 135-143.
4. Fowler, N. Turn about is fair play! J. higher Educ., 1946, 17, 131-135.
5. Frenkel-Brunswik, E. Mechanisms of self deception. J. soc. Psychol., 1939, 10,
409-420.
6. Kubo, Y. On the self analysis of character traits. Oyo Shinri Kenkyu (Jap. J.
appl. Psychol.), 1933, 1, 69-82.
7. Kubo, Y. Judgments of character traits in self and others. Oyo Shinri Kenkyu
(Jap. J. appl. Psychol.), 105-116.
8. Newcomb, T. An experiment designed to test the validity of a (self) rating
technique. J. educ. Psychol., 1931, 22, 279-289.
9. Remmers, H. H., and Martin, R. D. Halo effect in reverse: are teachers' ratings
of high school pupils valid? J. educ. Psychol., 1944, 35, 193-200.
10. Rosca, A. Valorea auto evaluarii (The value of self evaluation). Rev. Psicol.,
1940, 3, 33-41.
11. Ryans, D. G. Students' appraisals of their own abilities compared with objective
test results. Psychol. Bull., 1940, 37, 467-408.
12. Simpson, R. M. Self rating of prisoners compared with that of college students.
J. soc. Psychol., 1933, 4, 464-478.


The “Liberalism” of Congressmen Voting For and 
Against the Taft-Hartley Act 


Philip Ash 
The Pennsylvania State College 


The Taft-Hartley Act constitutes one of the most striking recent
instances of division between “liberals” and “conservatives.” However,
as yet no studies have appeared that provide concrete evidence for the
extent and character of this division.

This paper is intended as a brief note on the problem with respect to 
congressmen as they voted on the motion to override the veto of the 
Taft-Hartley Act. 


Source of Data

In the February 1948 issue of the Journal of Applied Psychology,
Brimhall and Otis¹ reported a study of the consistency of voting of
congressmen on a number of important issues over a five-year period.
Using as source data reports in the New Republic in which the votes of
512 congressmen were recorded as “progressive” or “anti-progressive,”
they assigned values on a seven-point scale (from “liberal” to
“conservative”) to each congressman for each of four report periods
studied.

In addition, for those congressmen for whom data were available for
at least two report periods, including the last one (1947), they gave an
average scale value. No congressmen were included in their study whose
first election was to the Eightieth Congress.

The average ratings they provided were used in the present study.
Where no average rating was computed by them, the ratings available
were averaged for the purposes of the present paper.

Their tables were checked against the roll calls on the vote to override
the Taft-Hartley veto in the Congressional Record of June 20, 1947 (for
the House of Representatives) and June 23, 1947 (for the Senate). The
congressmen on whom they presented ratings were then broken down into
four groups: those who voted to sustain the veto, those who voted to
override, those who were absent, and those who were not re-elected to
the Eightieth Congress.

¹ Brimhall, Dean R., and Otis, Arthur S. Consistency of voting by our
congressmen. J. appl. Psychol., 1948, 32, 1-14.

² A vote to sustain is defined as “liberal”; a vote to override as “conservative.”
It should be understood that “liberalism” as used here refers to the unvalidated
judgment of the New Republic.


Distribution of Votes on the Taft-Hartley Act 


The Brimhall and Otis study did not, as has been indicated above,
include ratings for all of the 96 senators and 435 representatives who
comprised the Eightieth Congress. Furthermore, of the senators, only 73
of those listed by Brimhall and Otis were in the Eightieth Congress; 15
were not. Of the representatives, 310 were in the Eightieth Congress,
125 were not. Those who were not in the Eightieth Congress failed to
stand for election, or failed to be re-elected after serving in a previous
Congress.

However, over 70% of the vote cast in both the House and the Senate 
is represented in the present study. In the case of both houses, a slightly 
higher proportion of the “progressive” vote is represented (78% in the 
House and 88% in the Senate) than of the “conservative” vote (71% in 
the House and 72% in the Senate).

Those who voted to sustain, both in the House and in the Senate,
ranged from “liberal” (rating of 1 or 2) to at least “middle of the road”
(rating of 4 or 5). However, in the House 87.7% of the votes to sustain
were cast by the definitely “liberal,” while in the Senate the comparable
figure was 81.9%. On the other hand, those who voted to override
ranged from “liberal” to “conservative.” The definitely “conservative”
(rating 6 or 7) composed only 43% of the overriding vote in the House,
and 40.8% of the overriding vote in the Senate. In other words,
clear-cut group differences emerge. Individuals who were rated as
“liberals” voted overwhelmingly to sustain, but substantial numbers of
the “liberals” also voted to override, a “conservative” act.

This distribution of the votes emphasizes the limited applicability 
of the rating for prediction of the voting behavior of the individual con- 
gressman for a single issue. 

In both the House and the Senate, however, the mean rating for those
who voted to sustain was over 3 scale points lower (more “liberal”) than
the mean rating for those who voted to override. These differences,
statistically significant in both cases far beyond the 1% level of
confidence, are of the magnitude of almost half the total possible range.
On a group basis, therefore, the Brimhall-Otis rating successfully
distinguished between the “liberals” and the “conservatives.”

Finally, in both the House and the Senate, those congressmen rated 
by Brimhall and Otis who were not in the Eightieth Congress (did not 
stand for or lost the 1946 election) were as a group significantly more 
“liberal” than those who voted to override the veto (Table 2). In both 
cases this difference was about 2 scale points. These differences were 
also significant far beyond the 1% level of confidence. 


Table 1

Distribution of Congressmen Rated by Brimhall and Otis by Vote on the
Taft-Hartley Veto and by Mean “Liberalism-Conservatism” Rating*

[Table body not legible in the source. Columns: House of Representatives
and Senate, each with No. and % for Sustain, Override, and Total. Rows:
ratings 1 (liberal) through 7 (conservative), Total, Mean Rating,
Difference (Override minus Sustain), and t-ratio.]

* Four members of the House and two members of the Senate rated by B-O
and in the Eightieth Congress were absent from the vote.



Table 2

Distribution of Congressmen Rated by Brimhall and Otis Who Did Not Stand for or
Were Not Reelected to the Eightieth Congress

                                 House of
                                 Representatives       Senate
Rating                           No.       %           No.       %
1                                 42      37.2           5      33.3
2                                 19      16.8           1       6.7
3                                 12      10.6           2      13.3
4                                  9       8.0           4      26.7
5                                 14      12.4           1       6.7
6                                 14      12.4           2      13.3
7                                  3       2.6
Total                            113     100.0          15     100.0
Mean Rating                               2.89                  3.07
Mean Rating of
  Override Group
  (from Table 1)                          4.88                  5.12
Difference
  Override minus Not Elected              1.99                  2.05
t-ratio                                   10.6                   6.6


This finding is at least partially corroborated by the fact, noted above,
that the vote-to-sustain group contained a higher proportion of holdovers
than the vote-to-override group.
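
The mean ratings in Table 2 and their differences from the override group follow directly from the frequency distribution, as the sketch below illustrates. The variable and function names are hypothetical, and the blank Senate entry for rating 7 is read as zero, which is an assumption (it is the value that makes the column total 15).

```python
# Frequencies by rating (1 = most "liberal", 7 = most "conservative"), Table 2.
house = {1: 42, 2: 19, 3: 12, 4: 9, 5: 14, 6: 14, 7: 3}
senate = {1: 5, 2: 1, 3: 2, 4: 4, 5: 1, 6: 2, 7: 0}  # blank cell read as 0

# Mean ratings of the override groups, as quoted in Table 2 from Table 1.
override_mean = {"house": 4.88, "senate": 5.12}


def mean_rating(freq):
    """Mean scale value of a frequency distribution of ratings."""
    n = sum(freq.values())
    return sum(rating * count for rating, count in freq.items()) / n


def percentages(freq):
    """Per cent of the group at each rating, to one decimal place."""
    n = sum(freq.values())
    return {rating: round(100 * count / n, 1) for rating, count in freq.items()}


house_mean = round(mean_rating(house), 2)
senate_mean = round(mean_rating(senate), 2)
print(house_mean, round(override_mean["house"] - house_mean, 2))
print(senate_mean, round(override_mean["senate"] - senate_mean, 2))
```

The t-ratios in the table cannot be recomputed this way, since they also require the distributions of the override groups.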


Summary and Conclusions 


Using ratings of “liberalism” developed by Brimhall and Otis, a com- 
parison was made of the differences in degree of “liberalism” between 
congressmen who voted to sustain the Taft-Hartley veto and those who 
voted to override. In addition, those who voted to override were com- 
pared with congressmen rated by Brimhall and Otis who were not in the 
Eightieth Congress, but were in previous Congresses. 

It was found that those who voted to sustain were significantly more 
“liberal” than those who voted to override. It was also found that those 
who failed to stand for election to, or failed in winning re-election to the 
Eightieth Congress were more “liberal” than those who voted to override. 

However, at least a small proportion of “liberals” actually voted
“conservative” (i.e., to override); on the other hand, no “conservatives”
voted to sustain the veto. This finding suggests that the
“liberalism-conservatism” index as here developed yields less reliable
predictions of individual voting behavior on single issues than of group
voting behavior on such issues or individual voting behavior on a group
of issues.

This study, like the Brimhall and Otis study on which it draws he[avily],
has for a principal reason the exploration of consistency of p[olitical]
behavior in the democratic setting. It suggests, in addition to the
t[ype] of research outlined by Brimhall and Otis, the need for certain
methodological studies. These would include a study of the validity of
the “liberalism” index used, correlation analysis to determine the
reliability or consistency of voting behavior as measured by the index,
and research to determine the parameters for an equation for individual
prediction.

Received April 5, 1948.




The Pin Prick Method of Secret Balloting 


Raymond J. Corsini 
San Quentin, California 


A common method of finding out what a group really feels about issues
on which its members would ordinarily conceal their views is to use a
secret ballot, wherein the anonymity of the individual subjects is
preserved. Most often a mimeographed or printed sheet is used, on which
subjects need write only “yes” or “no,” or perhaps check one of a number
of alternatives.

The writer has found the use of the simple method described below 
to have some advantages over the pencil-paper method and he presents 
it for the consideration of those who at times feel the need to sample the 
attitudes of individuals who may fear disclosure or betrayal of authorship 
by possible identification of their check marks. 

The ordinary type of question sheet is prepared and, instead of pencils
being used, ordinary straight pins, or preferably toothpicks, are
distributed with directions to push one hole at appropriate places if one
wants to answer “yes” and two holes if one wants to answer “no.” Or,
if a list of alternative responses is given, one merely punches a hole
following the appropriate response, or a sequence of holes if he is to
list alternatives in order of preference, etc.

The advantages seem to be:

1. The impression is gained that more people feel that this method is 
more secret than the use of pencils, since a pin hole is more anonymous 
than even pencil checks. 

2. Pencils are not required. They are often not possessed by
institutional individuals, and lending them out often results in
not getting them back.

3. No writing surface is needed, so that this type of questionnaire can
be used even when tables, one-armed chairs, etc. are not available.


Received May 6, 1948. 




Distributions of Scores on the Wechsler-Bellevue Scales and 
the California Test of Mental Maturity at a 
V. A. Guidance Center


May Herrmann and Roy B. Hackman 
Temple University 


The purpose of this paper is to report the results of an analysis of 
certain test scores of the 4500 veterans counseled at the Temple University 
Veterans’ Administration Guidance Center between September 1945 and 
November 1946. Of these 4500 veterans, 2289 were given the Wechsler- 
Bellevue Test and 571 were given the California Test of Mental Maturity. 
Staff Members at the Center were particularly interested in the results 
obtained on these tests of general mental ability. The question was 
raised in this Center as in others as to whether existing norms were ap- 
plicable to the veterans, or whether local norms should be made for use 
in veterans' advisement centers (1). 

The mental ability test most often administered at the Temple
University Veterans’ Administration Guidance Center is the
Wechsler-Bellevue. Tabulations completed at City College of New York and
the University of Michigan confirm the popularity of this test in
veterans’ advisement units (2, 3). The reliability of the test has been
the subject of other reports; Rabin concludes that the majority of
studies show high correlations between the Wechsler-Bellevue and other
individual and group measures of intelligence (4). He reports
correlations with the Stanford-Binet (Form L) ranging from .62 in
Anderson’s study of 112 female college freshmen to .91 and .93 in
Halpern’s and Benton’s studies of mental patients. He also quotes
Lewinski's study of 290 “psychopathic and subnormal” naval recruits
showing r = .73 between the Kent Emergency Test and the Verbal Scale of
the Wechsler-Bellevue. Rabin’s own study of 92 student nurses on the Army
Alpha 5 and the Wechsler-Bellevue results in r = .74. The lowest
correlation reported was in Anderson's study of the ACE and the
Wechsler-Bellevue, obtaining r’s of .48 and .53. It was added, however,
that the Verbal Scale correlation was higher than the full scale.
Sartain’s study concurs with these results (5). He tested 50 college
students in their freshman year and obtained correlations with
Wechsler-Bellevue (1941 Edition) and other tests as follows: Revised
Alpha Examination (Form 5): .74; Otis Self Administering Test of Mental
Ability (Form A): .70; ACE (1942 Edition): .69; Stanford-Binet (Form L):
.77. Watson’s article, which amplifies Rabin's, states that there are
fairly high correlations between the Wechsler-Bellevue Scales and verbal
measures of intelligence, but the correlations with performance type
scales are somewhat lower, although still substantial. He confirms the
trend reported by Rabin of relatively higher W-B I.Q.'s for duller
subjects and relatively lower ones for brighter subjects (6). He
describes Lewinski's studies of 100 Naval recruits suspected to be
mentally retarded, showing correlations of .65 and .64 between Scale A
and Scale B of the Herring Revision of the Binet-Simon Tests and the W-B
Verbal Scale. Goldfarb's study of 60 superior foster home children is
also included, with correlations of .86, .80 and .67 between
Stanford-Binet (Form L) and full, Verbal and Performance W-B Scales
respectively.


Procedures and Results

The veterans in the group counseled at the Temple University Center 
were young; 48% between 20-24 years and 29% between 25-29 years. 
An analysis of the educational level showed that over half had completed 
high school, 44% having credit for 12 grades and 9% having more than a 
high school education; 15% had 8 grades or less and 32% had 9, 10, or 11 
grades. The group was 91% white and almost entirely male. 

The distribution of I.Q.’s obtained for 2289 cases at the Temple
University Center is very similar to the test norms as published by
Wechsler (7) (see Table 1). The mean I.Q. is 101.0, the standard
deviation 13.96, and half the cases (52%) have I.Q.'s between 90 and 109.
These results show this veteran group to be comparable to Wechsler's
norm group.


Table 1
Distribution of Wechsler-Bellevue Total I.Q.’s

                                  No Disability       Disability
                                  Rating for          Rating for
              All Cases           Psychoneurosis      Psychoneurosis
I.Q.          f        %          f        %          f        %
50- 69         57     2.5          42     2.3          15     3.6
70- 79        116     5.1          85     4.5          31     7.5
80- 89        296    12.9         235    12.5          61    14.9
90- 99        527    23.1         412    22.0         115    27.9
100-109       656    28.6         552    29.4         104    25.3
110-119       486    21.2         419    22.3          67    16.3
120-129       141     6.2         122     6.5          19     4.6
130-144        10     0.4          10     0.6           0     0.0
Total        2289                1877                 412
Mean I.Q.   101.0               101.7                97.8
Standard
  Deviation 13.96               13.83               14.10

Of the 2289 veterans taking this test, 412 were receiving pensions for
psychoneurotic disabilities. As is true for all the mental ability tests
studied at this Center, results for the psychoneurotics as a group tended
to be lower than those for the entire sample (see Table 1). For the
Wechsler-Bellevue, the mean I.Q. of the 412 psychoneurotics was 97.8;
for the remainder of the distribution the mean I.Q. was 101.7. The
C. R. of 5.10 indicates that this is a statistically significant
difference. From the data available, the reasons for the poorer showing
of the psychoneurotics could not be determined (i.e., possible selective
factors, etc.).
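
The critical ratio can be checked from the means, standard deviations, and group sizes in Table 1. The sketch assumes the usual large-sample form, the difference between means divided by the standard error of that difference; the article does not spell out the formula, and the function names are hypothetical.

```python
from math import sqrt


def critical_ratio(m1, sd1, n1, m2, sd2, n2):
    """Difference between two means divided by its standard error."""
    se_diff = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (m1 - m2) / se_diff


def combined_mean(m1, n1, m2, n2):
    """Mean of the two groups taken together, weighted by group size."""
    return (m1 * n1 + m2 * n2) / (n1 + n2)


# Values from Table 1: no disability rating vs. disability rating.
cr = critical_ratio(101.7, 13.83, 1877, 97.8, 14.10, 412)
overall = combined_mean(101.7, 1877, 97.8, 412)
print(round(cr, 2), round(overall, 1))
```

Under this assumption the ratio comes to about 5.10, in agreement with the figure reported, and the weighted mean of the two groups returns the 101.0 given for all cases.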


Table 2
Distribution of Wechsler-Bellevue Part-Test Scores

             No Disability Rating for            Disability Rating for
             Psychoneurosis                      Psychoneurosis
             Verbal I.Q.     Performance I.Q.    Verbal I.Q.    Performance I.Q.
I.Q.         f       %       f       %           f      %       f      %
40- 69        42    2.24      46    2.45          18    4.3      14    3.4
70- 79       109    5.81      65    3.46          37    9.0      20    4.8
80- 89       289   15.40     216   11.50          74   17.9      58   14.1
90- 99       469   24.99     346   18.43         115   27.9      93   22.6
100-109      544   28.98     522   27.81         101   24.5     128   31.0
110-119      309   16.46     497   26.48          52   12.7      77   18.7
120-129      102    5.43     168    8.95          13    3.2      20    4.9
130-149       13    0.69      17    0.91           2    0.4       2    0.5
Total       1877            1877                 412            412
Mean I.Q.   99.7           103.4                96.4           99.9
S.D.       13.08           14.37               14.14          14.16


Part scores for the test were studied to determine whether such a
breakdown would reveal differences between cases who have a disability
rating for psychoneurosis and those who do not (see Table 2). Of the
1877 cases without ratings for psychoneurosis, the Mean Verbal I.Q.
was 99.7 and the Mean Performance I.Q., 103.4. This compares with
respective Means of 96.4 and 99.9 for the 412 psychoneurotics. The
data for both groups, therefore, demonstrate superiority in the
performance phase of the test. As previously mentioned, Rabin and Watson
believe that the verbal scale correlates more highly with the traditional
measures of intelligence and achievement than does the performance
(4, 6). On this verbal criterion, therefore, the total group involved
in this study is of slightly below average mental ability. Wechsler
himself concludes that performance scores surpass verbal except for ...
average intelligence (7). He also classifies young psychopaths
and behavior problems in the group scoring ... performance ...
Levi and Weider are in agreement (8). Several investigators ...
3649 non-psychopaths) to the Federal Penitentiary at Lewisburg, Pa.,
concludes that intelligence is not a significant factor in the diagnosis
of psychopathic personality (9). Brown, too, showed that for ...
to the Illinois State prisons from 1930-1936 the average intelligence
test scores were similar to those for the general population, but with
... somewhat higher on the performance scale and the mean difference
... the same order of magnitude (approximately 3.5 I.Q. points).


Since the Wechsler-Bellevue was the test which was usually given to 
the literate veteran of limited schooling (less than ten grades), this 
weighting of the population at the lower educational levels probably 
contributed to the slightly below average verbal I.Q. obtained and the 
discrepancy in favor of performance scores. 

The California Test of Mental Maturity was administered to a
different sample of 571 cases (see Table 3). The median I.Q. obtained was
109, the mean I.Q. 109.5, and the standard deviation 15 points. This
distribution is much higher than that for the “normal” population (based
on 100,000 persons) published in the Manual of Directions for the test
(12); it exceeds also the published norms for twelfth grade students and
approaches the published norms based on college freshmen (see Table 4).


Table 4

California Test of Mental Maturity: Comparison of Published Medians and Standard
Deviations with Those Obtained at the Temple Center (12, p. 19)

                                                Median    S.D.
Normal Population I.Q.'s (N = 100,000)          100.0     16.0
Ninth Grade I.Q.'s (N = 25,000)                 101.5     15.5
Tenth Grade I.Q.'s (N = 25,000)                 103.0     15.5
Eleventh Grade I.Q.'s (N = 25,000)              104.0     15.5
Twelfth Grade I.Q.'s (N = 25,000)               105.0     15.0
Veterans at Temple Center I.Q.'s (N = 571)      109.0     15.0
College Freshmen I.Q.'s (N = 15,000)            110.0     14.0
College Graduate I.Q.'s (N = 2,000)             125.0     12.0


It is noted that an I.Q. of 110 or higher was obtained for almost half
(47%) of the 571 veterans taking the California Test of Mental Maturity 
at the Temple University Center, as opposed to 21% of the 2289 veterans 
taking the Wechsler-Bellevue. In a recent study, H. M. Hildreth, Chief 
Clinical Psychologist, V. A. Branch 12, analyzes the California Mental 
Maturity Test scores for 248 older veterans (not including any of World 
War II). On total test performance his group is classified as dull-normal 
(median I.Q. 85.4), despite the fact that Hildreth grants that the norms 
for this test are generally considered as too high. The age (median 57.6 
years) and education (median grade 8.5) factors probably contribute to 
the low scores obtained (13). 
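
As a rough plausibility check on the 47% figure reported above for the California Test sample, the share of I.Q.'s at or above 110 can be estimated from the reported mean (109.5) and standard deviation (15). The sketch below assumes the scores are approximately normally distributed, which the article does not state; the function name is illustrative only.

```python
from statistics import NormalDist


def share_at_or_above(cutoff, mean, sd):
    """Proportion of a normal distribution lying at or above `cutoff`."""
    return 1.0 - NormalDist(mu=mean, sigma=sd).cdf(cutoff)


# Reported values for the 571 veterans given the California Test.
# Normality is an assumption made here, not a claim of the article.
expected = share_at_or_above(110, mean=109.5, sd=15)
print(f"expected share with I.Q. >= 110: {expected:.1%}")
```

Under that assumption the expected share falls a little under one half, which is in line with the observed 47%.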

Wechsler-Bellevue Scales and California Test 647


At the Temple University Center it is recognized that a selective
factor was in effect, since the California Test of Mental Maturity was
seldom administered to veterans having less than ten grades of education.
Furthermore, a large number of these veterans were contemplating
college entrance and requested advisement under P. L. 346. It must be
stressed that these G. I.’s (self-selected) would be expected to be above
average individuals as opposed to P. L. 16 Rehabilitation cases (required
to report for advisement) which should be a more representative group.
The number of psychoneurotics given the California Test was not suffi-
ciently large to warrant a separate study.


Summary 


1. A study was made of the distributions of obtained I.Q.'s for one 
sample on the Wechsler-Bellevue Scale (N = 2289); and for a second 
sample on the California Test of Mental Maturity (N = 571). 

2. Local Wechsler-Bellevue norms computed for the group of veterans 
studied at the Temple University Veterans' Administration Guidance 
Center agree closely with the published norms, although veterans with 
a disability rating for psychoneurosis scored somewhat lower.

3. For the sample studied on the California Test of Mental Maturity, 
the I.Q.’s obtained appear to be somewhat higher than might be expected 
with a random sample of veterans. Contributing factors to the superior 
showing are the higher than average educational level of this group and 
its self-selected character. 


Received April 23, 1948. 


References 


1. Staff, Advisement and Guidance Service, V. A. Washington, D. C. The use of
tests in the Veterans’ Administration Counseling Program. Educ. psychol.
Measmt., 1946, 6, 17-23.

2. Baker, Gertrude, and Peatman, John Gray. Tests used in Veterans’ Administra-
tion Advisement Units. Amer. Psychologist, 1947, 2, 99-102.

3. Darley, John G., and Marquis, Donald G. Veterans’ Guidance Centers: A survey
of their problems and activities. J. clin. Psychol., 1946, 2, 109-116.

4. Rabin, Albert J. The use of the Wechsler-Bellevue Scales with the normal and
abnormal person. Psychol. Bull., 1945, 42, 410-420.

5. Sartain, A. Q. A comparison of the New Revised Stanford-Binet, the Bellevue
Scale and certain group tests of intelligence. J. soc. Psychol., 1946, 23, 237-239.

6. Watson, Robert I. The use of the Wechsler-Bellevue Scales: A supplement. Psy-
chol. Bull., 1946, 43, 61-67.

7. Wechsler, D. The measurement of adult intelligence. Baltimore: Williams and
Wilkins Co., 1944.

8. Weider, A., Levi, J., and Resch, F. Performance of problem children on the
Wechsler-Bellevue Intelligence Scales and the Revised Stanford-Binet. Psychiat.
Quart., 1943, 17, 695-701.

9. Gurvitz, Milton S. The intelligence factor in psychopathic personality. J. clin.
Psychol., 1947, 3, 194-196.

10. Brown, A. W., and Hartman, A. A. A survey of the intelligence of Illinois prisoners.
J. crim. Law Criminol., 1938, 28, 707-719.

11. Harris, Robert E., and Thompson, Clare Wright. The relation of emotional adjust-
ment to intellectual function: A note. Psychol. Bull., 1947, 44, 283-287.

12. Sullivan, Elizabeth T., Willis, W. Clark, and Tiegs, Ernest W. Manual of Direc-
tions. California Test of Mental Maturity, Advanced Series. California Test
Bureau, Los Angeles, Calif., 1946 Revision.

13. Hildreth, H. M. The older veteran in the domiciliary home. Report by the
Clinical Psychology Unit, Neuropsychiatry Section, V. A. Branch 12, March
1947.




The Effects of Eliminating Binocular and Peripheral
Monocular Visual Cues upon Airplane
Pilot Performance in Landing *
Stanley N. Roscoe 
University of Illinois 


Throughout the history of modern scientific psychology the visual 
perception of relative position and movement in space has held a promi-
nent place in experimental and theoretical literature (1). Even so, our
present understanding of the subject is not adequate to solve all of the
perceptual problems which arise in connection with the designing of air- 
planes which will carry human pilots faster than the speed of sound. 
This paper is concerned with some of the specific problems of depth and 
movement perception which present themselves when we consider the 
possibility that some of our future supersonic airplanes may be designed 
to be flown without direct outside visibility. 

In such airplanes position and movement information must necessarily 
be presented to the pilot exclusively by instrumental means. Either 
television or radar, or both, may be employed for this purpose. However, 
all such devices as yet developed (2) present a restricted visual field on a 
flat surface similar in appearance to a small motion picture screen. When- 
ever three dimensional space is represented on a two dimensional surface, 
each eye sees the same restricted image, thus eliminating binocular dis- 
parity and parallax and the crossed and uncrossed double images normally 
present in binocular vision. Furthermore, any such display would re- 
strict peripheral outside visibility. 

Considering the emphasis which has traditionally been placed on un- 
restricted binocular vision in the selection and training of pilots and in 
the design of equipment, one might suspect that the restrictions imposed 
by a small, flat visual field would result in a serious impairment of pilot 
performance in the flight situation. Fortunately there is no experimental
evidence to indicate that such restrictions would make it impossible to 
fly an airplane. In fact, there is considerable evidence that the binocular 
visual cues are not particularly effective in the perception of spatial
relationships at such relatively great distances as are involved in the flight
situation. Furthermore, the perception of spatial relationships while
in flight is not an end in itself. The pilot's immediate task is to maneuver
his plane through a desired path in relation to other objects in space, i.e.,
to move the controls properly, and it is possible that this can be done
without the degree of perceptual discrimination afforded by unrestricted
binocular vision.

* This research was carried out under Contract N6ori-71, T. O. XVI, between the
Special Devices Center, Office of Naval Research, and the University of Illinois. This
paper is based upon Report No. 5 under that contract. The writer wishes to express
his appreciation to Professor A. C. Williams, Jr., who directed the research.


The Problem


Considering these facts, the present experiment was designed to in-
vestigate the effects of eliminating the binocular visual cues in the flight
situation and of restricting the angular range of visibility, which neces-
sarily occurs when a visual field is presented on a small, flat surface.
Two questions to be answered by the experiment were: (1) Can successful
flights be made at all under the above mentioned conditions of restricted
visibility, and (2) if successful flights can be made at all, what will be the
quality of the pilot performance?


Description of the Experiment 


In order to determine the effects of various conditions of visibility upon 
pilot performance in a specific flight maneuver, the experiment was designed 
to measure and contrast the accuracy with which experienced pilots make 
"spot" landings in two experimental situations and one control situation: 

Situation “A” (experimental): the restriction of the outside visual field to
a [illegible] area described by horizontal and vertical angles of approximately
10 degrees each and the elimination of binocular visual cues by the use of a
projection periscope image cast on a ground glass screen.

Situation “B” (experimental): a similar restriction of the range of visibility
by the use of vision reducing goggles and a vision directing screen, but without
the elimination of binocular visual cues.

Situation “C” (control): unrestricted binocular visibility in the normal
contact flight situation.

The experimental flights were conducted in a modified Cessna T-50 air-
plane. The task was to make an approach to a landing from straight and
level flight at an altitude of 800 feet and at a distance of more than one and
one-half miles from the end of the landing runway. This task was selected as
the one flight maneuver generally regarded by pilots and pilot instructors to
require the greatest depth and movement discrimination.

The criterion for the goodness of pilot performance was the accuracy of
the landing touchdowns in relation to a designated landing “spot,” since
Walker et al. (5) found this to be the best single measure of the overall per-
formance of experienced pilots performing this specific task. The desired
landing “spot” was defined by two target panels placed on each side of the
runway at approximately one-fourth of the way along the landing strip.

Situation “A.” The projection periscope was selected as the most practical
and economical device for presenting the visual field on a small two dimensional
surface. This instrument cast an image on a ground glass screen without
markedly reducing luminosity, clarity, or color. With respect to the elimina-
tion of binocular disparity, parallax and crossed and uncrossed double images,
this device was effectively equivalent to television. Only the so-called monoc-
ular visual cues were present, and they were restricted to central vision. They
included: (1) shading: both “cast” and “attached” shadows, (2) the sequence
of objects in space, i.e., the partial covering of far objects by near, (3) four
types of perspective: (a) linear or angular, (b) detail, (c) aerial: the partial
loss of object color of far objects, and (d) movement: the slower apparent
movement of distant objects.
Detail perspective, while present, was somewhat distorted by the differ-
ential clearness of focus of the periscopic image for objects at different distances.
Head movement parallax, although a monocular visual cue, is not effective
in viewing an image cast on a small flat surface. The non-visual cues of
accommodation, convergence, and change in the pupil were probably ineffective
in any of the three situations due to the relatively great distances involved.
However, they were definitely eliminated by the periscope, since all objects in
the visual field, that is, in the image on the screen, were presented at the same
distance from the subject’s eyes, no matter what their actual distances from
him might be.

¹ Woodworth (6, p. 680) concludes: “Except for ‘close work,’ the manipulation of
[illegible] before the eyes, the binocular cues are probably less important than
covering, shading, and the different kinds of perspective.”

Situation “B.” Since the projection periscope, and also television, of neces-
sity presents a single restricted visual field, it might be expected that the
pilots’ performances would be affected by the elimination of peripheral visual
cues (4) and head movement parallax which are supposedly of great value to
the pilot in landing an airplane. Therefore a second combination experi-
mental-control situation (“B”) was included in which the visual field was
restricted to the same extent as in the case of the periscopic image but without
the elimination of binocular depth effects. This was accomplished by the use
of vision reducing goggles and a vision directing screen to be described later.
Head movement parallax was reduced by instructing the subjects not to
move their heads in order to obtain successive views of different outside areas
through the vision directing screen (3).
Situation “C.” Landing performance while flying with unrestricted binocu-
lar visibility was used as a control. By comparing pilot performances in
situations “A” and “B” with each other and with performance in the control
situation (“C”), measures were obtained indicating what effects on pilot per-
formance can be attributed to the elimination of peripheral monocular visual
cues and head movement parallax and what to the elimination of binocular
visual cues.


Description of the Apparatus 


removal of the periscope.

The Periscope. The projection type periscope (see Figure 1) constructed
for use in situation “A” cast a television-like image on a two dimensional
ground glass surface. The lens used was three and three-quarter inches in
diameter and of 30 inch focal length. Since a lens casts an inverted and
reversed image, four mirrors were used in the periscopic system (see schematic
diagram in Figure 2) in order to present a correctly oriented image. First
surface type Panchronized mirrors were used. They were six by six inches
square. Each mirror was placed at an angle of 45 degrees to the principal
axis of its incident rays. The lens was placed in the system between the first
and second mirrors at a distance of 30 inches from the screen upon which the
image focused. The screen was six by six inches square and was placed per-
pendicular to the principal axis of its incident rays. The instrument was in-
stalled so that the image was perpendicular to the subject’s line of sight. The
image included a range of outside visibility described by a horizontal angle of
10° 40’ and a vertical angle of 11° 50’.


Fig. 1. Subject’s cockpit for Situation “A” showing periscope, hood and windshield
cutouts. The ground glass screen is shown above control wheel.

Fig. 2. Schematic diagram of the projection periscopic system.


The Vision Reducing Goggles. A special vision reducing eyepiece was con-
structed for use in situation “B.” A common laboratory type tubular vision
reducer (see Figure 3) was modified so as to restrict the visual field of each eye
individually to the same angular range of visibility (10° 40’ horizontal and
11° 50’ vertical) as presented by the periscopic image in situation “A.” This
was done by inserting an opaque screen in the vision reduction tube of the
eyepiece perpendicular to the subject’s line of sight and approximately one
inch from the subject’s eyes. Two slots, each [illegible] of an inch high and
[illegible] of an inch wide and separated by the horizontal distance of 2½ inches,
were cut in the screen so as to present an opening in front of each eye. Over
each of these slots was fitted an individual movable slide with an aperture
approximately [illegible] of an inch square (see Figure 3). These slides were
adjustable horizontally so as to compensate for the variability in the distances
between the eyes of different subjects. It was found experimentally that these
openings of approximately [illegible] of an inch square at a distance of
approximately one inch from the subject’s eyes allowed each eye the desired
range of visibility, thus presenting central binocular vision of the same visual
angle (both horizontally and vertically) as that presented on a flat surface by
the periscopic image.


Fig. 3. Vision reducers as modified for use in Situation “B” showing [illegible] square 
apertures in adjustable slides in front of each eye. 


In order to prevent the subject (in situation “B”) from obtaining successive
images of more than one outside area by moving his head and also to restrict
head movement parallax, a metal screen, made of sheet aluminum and painted
black, was attached to the top of the instrument panel perpendicular to the
subject’s line of sight and covering the area of his front windshield. A rec-
tangular aperture, 4.145 inches high by 6.234 inches wide, was cut in the screen
so as to allow the subject binocular vision of only one outside area directly
forward from the airplane (see Figure 4). The area seen was the same as that
presented by the periscopic image. The size of this aperture was determined
trigonometrically, using 20 inches as the approximate average distance from
the subject’s eyes to the screen and allowing for the average distance of approxi-
mately 2.5 inches between human eyes.
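
The trigonometric sizing of the screen aperture can be retraced from the figures given above. The article says only that the size was "determined trigonometrically"; the formula below is an inference that happens to reproduce the stated dimensions, namely the width subtended by the horizontal angle at 20 inches plus the 2.5 inch interocular distance, and the height subtended by the vertical angle alone. The function and parameter names are illustrative.

```python
import math


def aperture_size(eye_to_screen_in, h_angle_deg, v_angle_deg, interocular_in):
    """Width and height (inches) of a rectangular opening in a screen.

    Assumed geometry (not spelled out in the article): the opening subtends
    the given angles at the eye, and the width is extended by the interocular
    distance so that the region visible to both eyes spans the full
    horizontal angle.
    """
    half_h = math.radians(h_angle_deg) / 2
    half_v = math.radians(v_angle_deg) / 2
    width = 2 * eye_to_screen_in * math.tan(half_h) + interocular_in
    height = 2 * eye_to_screen_in * math.tan(half_v)
    return width, height


# Values from the text: 20 in. eye-to-screen, 10 deg 40 min by 11 deg 50 min,
# 2.5 in. between the eyes.
width, height = aperture_size(20, 10 + 40 / 60, 11 + 50 / 60, 2.5)
print(f"{width:.3f} in. wide by {height:.3f} in. high")
```

With these inputs the sketch gives an opening of about 6.234 by 4.145 inches, matching the dimensions reported for the vision directing screen.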


Fig. 4. Subject’s cockpit for Situation “B” showing vision directing screen, 
cockpit hood and vision reducing goggles worn by subject. 


The Cockpit Hood. A hood was constructed of black “Leatherette” for use
in situations “A” and “B.” It was attached around the subject pilot’s cockpit
by a series of snap fasteners so as to prevent the subject from seeing out either
to his right or left or to the rear. For situation “A” the windshields in front
of the subject were covered by cardboard cutouts which were held in place by
masking tape (see Figure 1). In addition to preventing any direct outside
visibility, this combination of hood and windshield masks darkened the interior
of the cockpit to such a degree that a clear, bright periscopic image could be
seen while permitting enough light within the cockpit to allow the subject to
use the other flight instruments (see Figure 1).

The Landing Targets. Two portable landing target panels, each three feet
high and fifteen feet wide, were constructed of white “Linene” cloth attached
to a wooden framework. One panel was placed at each side of the runway
opposite the landing “spot.” They were inclined at an angle of approximately
45 degrees to the surface of the runway so as to stand out more clearly in the
visual field during the landing approach.


Subjects 

Six subjects were tested in the experiment. All were flight instructors 
at the University of Illinois Institute of Aeronautics. Each was a 
qualified commercial pilot with both single and multi-engine ratings as 
well as C.A.A. Instrument and Instructor ratings. Each had at least 
1750 hours of flying time at the beginning of the experiment. Each had 
flown more than one multi-engine airplane and was familiar with the 
particular airplane used in the experiment. None of the subjects had 
previously participated in any experiment of this kind. 


The Experimental Design


Each of the six subjects was tested in each of the three experimental 
situations (periscope, goggles, contact), thus requiring eighteen separate 
experimental periods to complete the three series. The task was per- 
formed five successive times during each experimental period, ninety 
landing approaches being made in all, thirty in each of the situations. 

In order to balance any possible practice effects or transfer from one
situation to another, a different sequence of situations was presented to
each of the six subjects. By this arrangement, each of the three experi-
mental situations came first in the series for two subjects, second for two,
and third for two, thus equally distributing among the three situations
any advantage which might result from previous practice in another
situation.
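
The balancing described above can be checked mechanically: three situations admit exactly six orders, one per subject, and in that set every situation falls in each ordinal position twice. The sketch below is illustrative; the labels A, B, C follow the article, but the pairing of a given order with a given subject number is an assumption of the example.

```python
from collections import Counter
from itertools import permutations

situations = ("A", "B", "C")  # periscope, goggles, contact

# One distinct order per subject: the 3! = 6 permutations cover six subjects.
orders = list(permutations(situations))

# Count how often each situation falls in each ordinal position (1, 2, 3).
position_counts = Counter(
    (situation, position)
    for order in orders
    for position, situation in enumerate(order, start=1)
)

for subject, order in enumerate(orders, start=1):
    print(f"subject {subject}: {' -> '.join(order)}")

for situation in situations:
    counts = [position_counts[(situation, p)] for p in (1, 2, 3)]
    print(f"situation {situation}: times first/second/third = {counts}")
```

The six orders generated here are the same six that can be read from the parenthesized order numbers in Table 1.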


Procedure 


The take-offs and the first three legs of the rectangular patterns were flown 
by the safety pilot. The wheels were not retracted and the landing flaps were 
not used. The safety pilot made the final turn onto the landing approach. 
When the airplane was lined up with the runway at an indicated altitude of 
1550 feet (800 feet above the field elevation) and an indicated air speed of 90 
miles per hour, the safety pilot signaled the subject to take the controls by 
wiggling the ailerons. As the subject reduced power for the glide, the safety 
pilot trimmed the plane for a 90 mile an hour gliding speed. The subject was 
instructed to try to maintain that air speed and control the descent with power. 
He was allowed to make any type of landing, but it was recommended that he 
make tail high front wheel landings using a little power, as the runway was not 
visible on the periscope screen with the plane in the three-point attitude. The 
safety pilot was prepared to take over in the event the subject was unable to 
make a safe landing approach, but otherwise did not touch the controls except 
to avoid other airplanes. As soon as the subject had made a touchdown, the 
safety pilot took over again and made a follow-through takeoff for the next 
trial. At the beginning of each experimental series for each subject, the safety 
pilot made one demonstration landing so that the subject could observe how 
much his visibility would be restricted in the particular situation and what he 
could expect to see during the landing approach. Before any trials were made 
in situation “A,” the safety pilot pointed out, during his demonstration landing, 
the position of the runway image on the periscope screen during the approach
glide and how this position changed during the flare out to the level flight
attitude just before landing. Before any trials were made in either situation
“A” or “B,” it was explained to the subject that in the event of a cross-wind
any drift correction would have to be made by slipping rather than by crabbing,
so as to keep the runway within the range of visibility.

An assistant was stationed on the ground just opposite the landing “spot”
to mark and record the first point of touchdown. The paved runways hap-
pened to be conveniently divided into sections with expansion joints at twenty
foot intervals. Also the survey stationing was marked in the paving every
100 feet. By referring to these stations it was possible to record the errors in
the accuracy of the landings to the nearest foot, as shown by the tire marks
left on the runway after landings.

While no specific weather condition requirements were set up, no flights
were made in which the turbulence of the air or the direction and velocity of
the wind noticeably affected the accuracy of the subjects’ performances. No
flights were made during times when the sky was overcast, thus ensuring
maximum luminosity of the periscopic image in situation “A.”


Results 


The first question to be answered by the experiment was whether ex-
perienced pilots could make successful landing approaches at all using
only those outside visual cues presented by a projected periscopic image.
Of a total of thirty landings made with the periscope, the pilots completed 
all but two without assistance from the safety pilot. On first trials alone, 
five of the six pilots completed their approaches without assistance. 
Both unsuccessful approaches (one a first trial and one a third trial) 
occurred because the pilots failed to correct for wind drift, allowing the
image of the runway to pass out of view on the periscope screen. Appar- 
ently these errors did not arise from difficulties with depth perception, for 
when the plane was returned to the proper glide path by the safety pilot, 
the subjects were able to complete the landings without further assistance. 

Most pilots normally use a combination of “slip” and “crab” to correct 
for wind drift during contact landing approaches, and this technique 
was ineffective with the angular visual limitations imposed by the peri- 
scope. Drift correction with the periscope required a special straight 
ahead “slipping” technique which was adopted by all the pilots and em- 
ployed successfully when necessary on all approaches but the two men- 
tioned above. 

Since successful landings were made with the periscope, it would be 
expected that they could also be made with the goggles. This proved 
to be the case, as all landings made with the goggles and the vision 
directing screen were completed without assistance.

The safety of the landings, both those made with the periscope and
with the goggles, was judged acceptable² by the safety pilot. These
judgments included a consideration of the initial approach, the “flareout,”
and the actual touchdown. The pilots reported confidence in determining
the proper time to flare out the approach for the landing touchdown, and
although they frequently “skipped” or bounced on landings (which they
seldom did with unrestricted vision), they did not lose control of the
airplane when the wheels touched the ground.

The second question, to be answered in the event safe landings could
be made at all with the periscope and with the goggles, involved an ob-
jective evaluation of the quality or goodness of the pilot performance.
The criterion for this objective rating was the accuracy of the landing
touchdowns. The raw error scores in feet from the landing “spot” for
each of the ninety landings performed by the six pilots are shown in
Table 1. An inspection of the table reveals that the most accurate land-
ings were made with unrestricted visibility, while the least accurate land-
ings were made with the periscope. Table 2 shows the average or mean
errors (signs disregarded) for the thirty landings made in each of the
three situations. The average error for contact landings was 85.1 feet
from the landing “spot,” for goggle landings 142.4 feet, and for periscope
landings 251.8 feet. The difference between the average accuracy of the
landings made with the periscope and those made in the control situation
was significant at the 1% level. The difference in the accuracy of the
landings made with the periscope and with the goggles was also significant
at the 1% level. The difference between goggle landings and contact
landings was significant at the 5% level but not at the 1% level. Before
using the massed data to make the above comparisons, the error scores
for the individual trials of the six subjects in the three different situations
were correlated. The coefficients were all insignificant.

² In pilot vernacular, a “good” landing is “any one you can walk away from.”
These experimental landings were ones you could fly away from.


Table 1

Tabulation of Raw Data

The errors in feet for each of the ninety landing approaches made in the experiment
are listed according to Subject and Situation. Each group of five scores represents
the five approaches made by one subject in one situation during one experimental
period. The scores are listed in the same order in which the trials were performed
during that period. The numbers in parentheses indicate the order in which each sub-
ject was tested in each of the three situations.

                      Situation “A”      Situation “B”      Situation “C”
Subject    Trial       (Periscope)         (Goggles)          (Contact)

   1         1          -315                +474               -174
             2          -308                  +6                 -7
             3          -212                -223               -101
             4          -232                -230                 +9
             5          +489 (1)            -110 (2)            -32 (3)

   2         1          +300*               +275               +370
             2          -270                   0                -70
             3          +302                 -70               +100
             4          -286                 +60               +163
             5          +500 (1)             +40 (3)            -26 (2)

   3         1          -620                 +20                -44
             2           +70                 -10                +17
             3           +50                  60                +58
             4           340                -135                -10
             5          +510 (2)            -120 (1)            +50 (3)

   4         1          -402                -406                -22
             2          -255                -159                -30
             3          -280                +810                -32
             4          +210                -178                +52
             5          -197 (3)            +190 (1)            -62 (2)

   5         1           -23                +190               +180
             2          +210          [illegible]              -380
             3            50                -280               +155
             4           338                -170                 37
             5          -188 (2)            -302 (3)           -117 (1)

   6         1          +170                 +31               -168
             2           +10          [illegible]               -40
             3           -20*                +15                -17
             4           +42                +150                -15
             5          -355 (3)             -40 (2)            +30 (1)

* Indicates approaches in which the subject would not have made a successful
landing without assistance from the safety pilot.


Table 2

Summary of the Results that Obtain when Performance Scores of the Six Pilots Are
Pooled and Measures of Central Tendency and Variability Are Computed

The constant errors, average errors, standard deviations, and the standard errors of
the means (average errors) for the thirty landing approaches made in each of the three
situations. Values are expressed in feet.

                                Situation “A”    Situation “B”    Situation “C”

Constant Error                      -15.7            -19.8             -4.0
Average or Mean Error               251.8            142.4             85.1
Standard Deviation                  155.7            124.5             94.0
Standard Error of the Mean           28.9             23.1             17.4


Table 3

Significance of the Differences between Group Mean Performances in
Different Experimental Situations

Product-moment correlation coefficients between the raw error scores (signs dis-
regarded) of the individual landings of all subjects in the three situations,* differences
between group means for the different situations, standard errors of the differences
between the correlated means, critical ratios, and the level of significance of the differ-
ences between means as determined by Fisher’s Test.

Situations “A” and “B”       Situations “A” and “C”       Situations “B” and “C”

r_AB = .033                  r_AC = -.132                 r_BC = .136
M_A - M_B = 109.4            M_A - M_C = 166.7            M_B - M_C = 57.3
σ_diff = 36.4                σ_diff = 35.6                σ_diff = 27.0
CR = 3.01                    CR = 4.68                    CR = 2.12
Difference significant       Difference significant       Difference significant
at 1% level                  at 1% level                  at 5%, not at 1% level

* The scores were matched for correlation as they appear in Table 1. Thus each
subject’s 1st, 2nd, 3rd, 4th, and 5th landings in each situation were matched with his
corresponding landings in each of the other situations.
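
The summary statistics in Tables 2 and 3 can be retraced from one another. The sketch below is a reconstruction under stated assumptions, not the author's documented procedure: it assumes the standard error of each mean was taken as the standard deviation divided by the square root of N - 1, and that the standard error of a difference between correlated means was taken as the square root of SE1² + SE2² - 2·r·SE1·SE2. Those two formulas reproduce the printed values to within rounding; the article itself names only "Fisher's Test."

```python
import math

N = 30  # landings per situation

# Summary values from Table 2 (feet): mean absolute error and standard deviation.
mean_err = {"A": 251.8, "B": 142.4, "C": 85.1}
sd = {"A": 155.7, "B": 124.5, "C": 94.0}
# Correlations between matched error scores, from Table 3.
r = {("A", "B"): 0.033, ("A", "C"): -0.132, ("B", "C"): 0.136}


def se_mean(s, n):
    """Standard error of a mean, using the sqrt(n - 1) divisor (assumed)."""
    return s / math.sqrt(n - 1)


def se_diff(se1, se2, r12):
    """Standard error of the difference between two correlated means."""
    return math.sqrt(se1**2 + se2**2 - 2 * r12 * se1 * se2)


se = {k: se_mean(v, N) for k, v in sd.items()}

critical_ratio = {}
for (x, y), rxy in r.items():
    diff = mean_err[x] - mean_err[y]
    critical_ratio[(x, y)] = diff / se_diff(se[x], se[y], rxy)
    print(f"{x} vs {y}: difference {diff:.1f}, C.R. {critical_ratio[(x, y)]:.2f}")
```

Run as written, the critical ratios come out within about 0.01 of the printed 3.01, 4.68, and 2.12. The small discrepancies are consistent with the printed values having been computed from already rounded standard errors.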

A further result, incidental to the purpose of the experiment and 
provided for in the design, appeared from the analysis of practice effects. 
An inspection of the raw error scores in Table 1 reveals the absence of 
any intra-serial practice effects either with the goggles or the periscope. 
Apparently five trials made in rapid succession in the new situations had 
as much fatigue effect as practice effect. The subjects reported the 
new instruments to be most exacting. To test for inter-serial gains, or 
transfer-practice effects resulting from practice on the same task under 
different conditions, the raw error scores from Table 1 were converted to 
standard scores and rearranged according to whether they were performed 
during each subject’s first, second, or third experimental period, regard- 
less of the situation in which the subject was tested. The results of the 
various computations are not shown, but it was found that there was 
consistent improvement, the improvement from the first periods to the 
third periods being significant at the 5% level. 
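
The conversion to standard scores mentioned above can be illustrated as follows. The article does not say how the conversion was done; this sketch assumes each absolute error was standardized against the mean and standard deviation of its own situation (Table 2), so that scores from different situations can be pooled by experimental period. The three example errors are taken from the first subject listed in Table 1.

```python
def standard_score(abs_error_ft, situation_mean, situation_sd):
    """z-score of one landing error relative to its own situation."""
    return (abs_error_ft - situation_mean) / situation_sd


# Table 2 summary values (feet): (mean absolute error, standard deviation).
summary = {"A": (251.8, 155.7), "B": (142.4, 124.5), "C": (85.1, 94.0)}

# Illustrative landings: (situation, absolute error in feet).
landings = [("A", 315), ("B", 110), ("C", 32)]
z = [standard_score(err, *summary[s]) for s, err in landings]

for (situation, err), score in zip(landings, z):
    print(f"situation {situation}, error {err} ft: z = {score:+.3f}")
```

A positive score marks a landing less accurate than the average for its situation, a negative score one more accurate; on this scale a periscope landing and a contact landing become directly comparable.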


Discussion 

Since there were no similar or directly related studies upon which pre- 
dictions could be based for the present investigation, the experiment was 
oriented as a direct attack at the applied rather than the pure problem 
of space and movement perception in the flight situation. It was not 
known whether pilots could make landings at all with the outside visual 
field presented on a small, flat surface. The results demonstrated that 
this could be done. It remains to discuss what further inferences can 
be drawn from the accuracy of the performances in the three experimental 
situations. 

If it were true that binocular depth effects contribute little to space 
and movement perception at such relatively great distances, it would be 
expected that landing accuracy would be affected approximately the same 
amount by the goggles as by the periscope. This was not the case. This 
may be interpreted in three possible ways. First, it is possible that the 
binocular depth cues as such are of significant importance. This could 
easily be tested in another experiment simply by covering one eye and 
comparing the accuracy of the landings made in this condition with the 
results from situation “C.” It is doubtful that the difference would be 
significant.

A second interpretation would postulate that it is the amount rather
than the kind of visual restriction imposed that determines the accuracy
of the performance, i.e., the more cues taken away, the greater the errors
in landing accuracy. Thus when the binocular cues are eliminated in 
addition to restricting the angular range of visibility, as was done by the 
periscope, errors become significantly greater than when only the angular 
range is restricted. 

The third possibility is that the apparatus used did not achieve an
equivalent restriction of peripheral vision in the two experimental situa- 
tions as it was designed to do. The restrictions imposed by the periscope 
were definitely known. The design and use of the goggles rendered the
accurate determination of the true amount and nature of their restricting 
effects most difficult. A consideration of the techniques employed by 
the pilots in the various situations reveals a possible source of error for 
the results obtained with the goggles.

While flying with unrestricted visibility, the pilots were able to keep 
the landing targets in view at all times until the landing had been made. 
They reported this cue most effective in making accurate landings in 
situation “C.” While flying with the periscope or with the goggles, 
the targets, which were of necessity placed at the sides of the runway, 
passed out of the visual field while the airplane was still several hundred 
feet from the landing “spot.” When flying with the periscope the sub- 
jects had no control over this, but with the goggles they could do some- 
thing about it. Although they could not increase their angular range of 
visibility for any given moment, they could, by shifting their position in 
the cockpit, change the area of visibility which they saw through the 
vision directing screen. By simply leaning forward and slightly to the 
right, they could keep the target on the left side of the runway in sight 
for a few seconds—and several hundred feet—longer than it could be 
seen with the periscope. The subjects were instructed not to shift their 
position for this purpose, and while they did not do so consciously or 
intentionally, according to their reports, the additional advantage was 
available to them, and it is possible that they did make use of it. Also it 
is suspected that this may have accounted for the absence of serious 
errors in drift correction in situation “B.” By leaning to the upwind 
side of the cockpit, the runway could be kept in sight through the vision 
directing screen even though the airplane were slightly crabbed. 

That this was possible represents a fault in the experimental procedure 
which can only be corrected in later investigations. The validity of the 
results for situation “B” is to be suspected. If the true difference between
the accuracy of performances in situations “B” and “C” were
greater than it appears, the true difference between performances in 
situations “A” and “B” would be correspondingly less, indicating the 
greater relative importance of the peripheral monocular visual cues in 
the flight situation. 


The present investigation demonstrated that pilots can make success- 
ful approaches to landings both with a simple angular restriction of the 
peripheral visual field and with a similar angular restriction of outside 
visibility plus the effective elimination of the binocular cues of depth 
and movement. The significant differences in the accuracy of the pilots’ 
performances while flying with the periscope, with the vision restricting 
goggles and vision directing screen, and in the control condition of un- 
restricted visibility, indicate both of these groups of visual depth and 
movement cues to be important in the accurate landing of an airplane. 
While the present results suggest the binocular to be relatively more im- 
portant than the peripheral monocular cues, the validity of these partic- 
ular results is to be questioned in view of the unforeseen procedural 
difficulties encountered with the goggles and the vision directing screen. 

However, the important conclusion to be drawn from this investigation is
that pilots are successful in making use of whatever outside visual cues 
are presented to them while making airplane landing approaches. 


Summary 

Six instrument pilots were tested for accuracy of landing under these 
conditions of visibility in a Cessna T-50 aircraft: 

Condition A. Outside vision restricted to an image cast on a ground 
glass screen by a projection periscope. The image confined the range of 
outside visibility to a visual angle of approximately 10 degrees, both 
horizontally and vertically, and binocular cues of depth and movement 
were eliminated. 

Condition B. Similar restriction of the outside visual field achieved 
by use of vision reducing goggles and a vision directing screen. Binocular 
cues were present. 

Condition C. Unrestricted outside visibility as in normal contact 
flight. This situation served as control. 

Every pilot made five landings under each condition, ninety landings 
being made in all. Conditions were rotated among pilots to balance 
practice effects. 

Safe approaches and landings were made by all pilots in all conditions. 
Landings were most accurately made in Condition C, control, where the
average error was 85.1 feet from the landing "spot" (sign of error dis- 
regarded). The average error for landings in Condition B (vision reducing 
goggles) was 142.4 feet. Least accurate landings were made in Condition 
A (periscope) where the average error was 251.8 feet. These differences 
were found to be significant in each case, but the difference between 
periscope and goggles may be exaggerated as a result of certain unforeseen 
procedural difficulties which were uncontrolled. 


Received April 8, 1948. 


References 


1. Boring, E. G. Sensation and perception in the history of experimental psychology.
New York: Appleton-Century, 1942, 263-311.

2. Pi Suñer, A. The third dimension in the projection of motion pictures. Amer. J.
Psychol., 1947, 60, 116-118.

3. Tiffin, J., and Bromer, J. Analysis of eye fixations and patterns of eye movement in
landing a Piper Cub J-3 airplane. Washington, D. C.: C.A.A., Division of
Research, Report No. 10, February, 1943.

4. Tinker, M. A., and Carlson, W. S. Sensitivity of peripheral vision in relation to skill
in landing an airplane. Washington, D. C.: C.A.A., Division of Research,
Report No. 14, April, 1943.

5. Walker, R. W., Bennett, S. V., and Ewart, E. S. A study of individual differences
among flight instructors in making spot landings. Washington, D. C.: C.A.A.,
Division of Research, Report No. 56, February, 1946.

6. Woodworth, R. S. Experimental psychology. New York: Henry Holt and Co., 1945,
651-683.


Prediction of Male Readership of Magazine Articles *


Evelyn Perloff**
Ohio State University 


The purpose of this study was to determine the way in which five 
variables combined for maximum readership of articles in The Saturday 
Evening Post. The study dealt with the reactions of men only. The 
readership likes and dislikes of women will form a separate study. Since 
the ultimate objective was to predict, prior to publication, how many 
male readers would start to read the published articles, the multiple re- 
gression technique was used. 


The Data 


The Articles. The present study was limited to articles which ap- 
peared in The Saturday Evening Post throughout 1946. An "article" 
was a non-fiction item that did not appear regularly, contained 10% or 
more text, and was not an editorial. There were 190 articles included 
in the study. They represented about 50% of all articles which appeared 
in the Post in 1946. The articles used were those appearing in issues of 
the Post on which readership surveys had been made. Questions were 
asked to determine the number of men who saw, started, and finished 
each item in the issue. Since the primary concern of the study was the 
article’s power to attract male readers, the criterion selected was the per- 
centage of men who saw the article and started to read it. Such per- 
centage figures will be referred to hereafter as the "starting readership 
per cent." 

The Variables. Five variables were included in the study. These 
variables, although not necessarily the best determinants of starting 
readership per cent, were available in present records and it was believed 
that they might possess some predictive significance. The five variables 
used in the present study are primarily of interest to editors responsible 
for layout. The variables were: number of illustrations, color of illustrations,
sex of persons in illustrations, proportion of opening page(s) devoted
to text, and subject matter of the article.

* This study was conducted while the writer was a research associate in the Development
Division of the Research Department, The Curtis Publishing Company.

**The author wishes to express her grateful appreciation to Mr. Herbert C. Ludeke, 
Manager, Development Division, Curtis Publishing Co., and to Mr. Richard Gaylord 
and Dr. Hubert Brogden, The Adjutant General’s Office, who offered many useful criti- 
cisms during the research and in reading the manuscript. 


Number of illustrations was merely a count of the number of pictures 
allotted to the article. Color of illustrations consisted of the following 
three classes: full-color, duotone (usually gives the impression of black 
and white with over-all color tint) and black and white. Sex of persons 
in illustrations also had three classes: males only, females only, or both 
males and females. Proportion of opening page(s) devoted to text was 
classified into three groups: (1) articles with less than 20% of opening 
page(s) given to text, (2) articles having 20% to 39% of the opening page(s)
in text, and (3) articles with 40% and over of the opening page(s) devoted 
to text. The categories of the subject matter variable were based on a 
classification system (viz., business, war and peace, literature and the 
arts, recreation, etc.) which is a modification of the one proposed by 
Waples and Tyler.¹ An article was classified on the basis of its title, sub-title,
illustrations, and captions only. The text of the article was not
considered in classifying the article. 


The Procedure 


Since three variables were qualitative, it was necessary to assign 
numerical values to the various classes (within the variables) in order to 
handle the data quantitatively. The first step in determining the numer- 
ical value of a given class was to calculate the mean starting readership 
per cent (criterion) for all classes. The mean starting readership (per 
cent of readers seeing and starting the article) of a class was considered 
the best indicator of its predictive potency. If, for example, articles 
containing five illustrations had a mean starting readership per cent 
that was substantially higher than the mean starting readership per cent 
of those articles with only two illustrations, then it seemed reasonable to 
give the five illustrations category more weight than the two illustrations 
category. To weight classes in their relative importance to the starting 
readership per cent implied that the higher mean criterion score should 
receive the higher numerical value. This was the practice adhered to in 
the assignment of code numbers. 
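
The coding rule just described can be sketched in Python as follows. This is a minimal illustration, not the author's worksheet: the readership figures are invented, the function name is hypothetical, and the use of consecutive integer codes is an assumption.

```python
from statistics import mean

def code_classes_by_mean(observations):
    """Map each class label to a code 1..k, ordered by mean criterion score.

    `observations` is a list of (class_label, starting_readership_per_cent)
    pairs. The class with the lowest mean gets code 1, the highest gets k.
    """
    scores = {}
    for label, value in observations:
        scores.setdefault(label, []).append(value)
    ranked = sorted(scores, key=lambda label: mean(scores[label]))
    return {label: code for code, label in enumerate(ranked, start=1)}

# Hypothetical readership figures, invented for illustration only.
sample = [
    ("black and white", 52.0), ("black and white", 58.0),
    ("duotone", 60.0), ("duotone", 64.0),
    ("full-color", 70.0), ("full-color", 66.0),
]
codes = code_classes_by_mean(sample)
```

With these sample figures the class with the highest mean readership receives the highest code, which is the ordering the text calls for.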

Having coded the variables in terms of mean starting readership per 
cent, our next step was to determine the degree to which they were related
to the criterion and to each other. Computation of all 15 intercorrelations
provided the necessary data. All correlation coefficients were
Pearson Product Moment.
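
As a brief illustration, the product-moment coefficient and the count of 15 pairs (six measures taken two at a time) can be written out in Python. The function name and the short labels are illustrative choices, not the author's notation.

```python
from itertools import combinations
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Six measures: the criterion plus the five coded variables.
measures = [
    "starting readership %", "no. of illus.", "color of illus.",
    "sex of persons", "% text on opening page(s)", "subject matter",
]
pairs = list(combinations(measures, 2))  # every distinct pair of measures
```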

Since it was found that so many of the variables were related to each 
other, it would be impossible to determine their individual contribution 
to the success of an article, although this is highly desirable. With the 
multiple correlation procedure, however, an approach to the solution of
this problem can be reached by examination of the relative effect of each
of these variables in the composite effect on starting readership. At the
same time the over-all value of this set of variables in the optimum prediction
of starting readership can only be evaluated by the multiple correlation
procedure. The regression equation for predicting purposes was
obtained and used to predict the starting readership of articles appearing
in issues of The Saturday Evening Post during another year, 1947. This
provided an indication of the validity of the present technique in predicting
the starting readership per cents of future Post articles.

1 Waples, D., and Tyler, R. W., What People Want to Read About, pp. 224-241, 1931.


The Results and Discussion 


The findings will be presented in three sections: (1) The Distributions; 
(2) The Determination of the Composite Effect; and (3) The Cross- 
validation. 

The Distributions. Figures 1-5 show the distributions (in bar chart 
form) of starting readership per cent for each of the five variables. The 
left-hand column of each Figure indicates the various levels of readership. 
The characteristics of the articles are shown across the bottom of the 
graph. The number of articles for each class of a variable is shown in
the appropriate column. All starting readership per cents referred to in 
the article are indexes and not actual figures. 

The ultimate value of each of the variables studied cannot be deter- 
mined until they are examined, holding all others constant, because of 
the relationship among them. It is, for instance, not yet known whether 
less than 20% or between 20-39% of the opening page(s) devoted to 
text is better. Perhaps the better articles are set up with less than 20% 
opening text because of the belief that the opening page(s) of the more im- 
portant articles should contain little text. Any conclusions concerning 
the ultimate value of a particular variable are those that one would draw 
in considering this variable as the only one changing. Unfortunately, 
however, as will be later pointed out, any one variable is related to others 
and practical conclusions from these results may or may not exist. 

Figure 1 shows the distributions (in bar chart form) of starting reader- 
ship per cent and the number of illustrations. 

The relationship (correlation coefficient = .35) of number of illustrations 
to starting readership per cent indicates on the face of it that number of 
illustrations significantly influences the male reader in starting an article. 
In examining Figure 1 it will be apparent that there are two definite 
breaks. Thus, there are sharp changes from two illustrations and below
to three and above and from five illustrations to six and above. The 
general trend is for starting readership to improve with the number of
illustrations.

Fig. 1. The effect of number of illustrations upon starting readership per cent
(N = 190) (r = .35). [Bar chart; horizontal axis: number of illustrations in article;
vertical axis: starting readership % index.]


The distributions of color of illustrations and starting readership per
cent are given in Figure 2. 

There appears to be a definite relationship (correlation coefficient = .28)
between the amount of color and starting the article. There are no 
breaks in the mean starting readership values for the classes in color of 
illustrations as there are for the variable, number of illustrations. There
is an increase in the starting readership per cent progressively from black
and white to full-color. The distribution for the category, “other,” is
widely and evenly spread in its variation and, therefore, no conclusions
can be drawn. “Other” includes articles having no illustrations or having
illustrations in two-color or in black and white plus color.

Fig. 2. The effect of color of illustrations upon starting readership per cent
(N = 190) (r = .28). [Bar chart; horizontal axis: color of illustrations (Other,
Black & White, Duotone, Full-Color); vertical axis: starting readership % index.]

The influence of sex of persons in illustrations on starting readership
is shown in Figure 3.

Fig. 3. The effect of sex of persons in illustrations upon starting readership per cent
(N = 190) (r = .22). [Bar chart; horizontal axis: sex of persons in illustrations
(Female, Male & Female, Male, No Data); vertical axis: starting readership % index.]


The relationship (correlation coefficient = .22) between sex of persons
in illustrations and starting readership also appears to be of some significance.
As in the case of number of illustrations there is a sharp change in
starting readership from illustrations with females only to illustrations
showing men. Although there is a preference by the male reader for
pictures of his own sex only, little was lost by using illustrations with both
males and females. The higher value for articles having illustrations
including males suggests that such articles have more preference with
men and are more likely to be started by the male readers than articles 
which contain women in their illustrations. As pointed out previously, 
however, it may not have been the type of illustration alone but other 
correlated factors that are responsible for the starting readership. The 
subject matter of an article is important for starting readership and the 
kinds of illustrations are undoubtedly dependent upon this variable. 
“No Data” refers to articles with no illustrations or to those with illustrations
in which there were no people or the sexes of the persons shown
were not discernible.

Figure 4 shows the distributions of proportion of opening page(s) de- 
voted to text and starting readership per cent. 


Fig. 4. The effect of proportion of opening page(s) devoted to text upon starting
readership per cent (N = 190) (r = -.16). [Bar chart; horizontal axis: per cent of
opening page(s) devoted to text (40% & Over, 20%-39%, 19% & Under); vertical axis:
starting readership % index.]


There appears to be an inverse relationship (correlation coefficient
= -.16) between this variable and how many men will start to read an
article. It is apparent from Figure 4 that devoting less than 20% of the
opening page(s) to text results in the highest starting readership. There is
a clear change between devoting less than 20% of the opening page(s) to
text and devoting 20% or more to text. The general trend is for starting
readership to improve as the per cent of text on the opening page(s)
decreases.

Figures 1-4 show distributions of the four variables which are ob- 
jective; that is they can be defined in only one way. The fifth variable, 
the subject matter of the article, depended upon the judgments of the 
classifiers, which were often at variance. In addition, the specific in- 
terests of the respondents, based so often upon the events of the times, 
may vary considerably. As a result, this variable cannot be considered 
so stable as the preceding ones. 

Figure 5 shows the distributions of subject matter and starting reader- 
ship. There is greater variation among the classes of this variable than 
in any other. The number of cases in any one category was often too 
few to consider the category as a separate class. This eliminated various 
classes which are part of the gamut of subjects upon which Post articles 
are written. These articles were classified under the category, “Other.” 
The relationship (correlation coefficient = .42) indicates clearly that the
subject matter of an article considerably influences the male reader to
start it. In examining Figure 5 it will be apparent that men who read
The Saturday Evening Post have definite likes and dislikes of Post topics.
Although there is a steady increase in starting readership from topics
least liked to those best liked, there are also several sharp changes
grouping together both similar levels of preferences and similar kinds of
subject matter. The general trend is for male starting readership to improve
significantly when Post articles deal with topics such as sports, description
and analyses of campaigns, information about army life and
personal war adventures. These topics reveal a preference by male
readers for action-type articles. Articles on health and hygiene and
general aspects of business offered less attraction to men.

Fig. 5. The effect of subject matter of article upon starting readership per cent
(N = 190) (r = .42). [Bar chart; horizontal axis: subject matter; vertical axis:
starting readership % index.]


Table 1
Intercorrelations between Variables 1-5 and of Starting Readership Per Cent
(N = 190)

Variable                          Starting      No. of   Color of   Sex of    % Text on         Subject
                                  Readership %  Illus.   Illus.     Persons   Opening Page(s)   Matter
Starting Readership %                 --          .35      .28        .22       -.16              .42
No. of Illus.                         .35         --       .52       -.17       -.14              .20
Color of Illus.                       .28         .52      --        -.10       -.42              .01
Sex of Persons                        .22        -.17     -.10        --         .14              .31
Per Cent Text on Opening Page(s)     -.16        -.14     -.42        .14        --               .03
Subject Matter                        .42         .20      .01        .31        .03              --

The Determination of the Composite Effect. The correlation matrix 
is shown in Table 1. The horizontal and vertical headings indicate the 
five variables used in the study. Proportion of opening page(s) devoted 
to text gave the lowest correlation (correlation coefficient = -.16) with
starting readership per cent, while the coefficient between subject matter 
and starting readership was the highest (correlation coefficient = .42). 

Although at this point we considered individually those variables
having the highest correlation with the criterion as indicating the best
articles, when these variables are combined the most valuable variables
will be identified by the relationship of high correlation coefficients with
the criterion and low correlations with the other variables. Such a consideration
is automatically taken care of when the multiple correlation
coefficient is computed.


Table 2 
Weights of Five Variables for Predicting Starting Readership Per Cent 
(N = 190) (R = .56) 


Variable                                            Weight
Subject Matter                                        .31
Number of Illustrations                               .25
Sex of Persons in Illustrations                       .20
Color of Illustrations                                .12
Proportion of Opening Page(s) Devoted to Text        -.11
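
As a rough check, weights of this kind can be recovered from the intercorrelations in Table 1 by solving the normal equations for standardized variables. The Python sketch below is not the author's computing procedure. It assumes Table 1 is symmetric, takes its entries as rounded to two decimals, and uses a hypothetical helper named `solve`, so the results should only land close to the published weights and to R = .56.

```python
from math import sqrt

# Correlations transcribed from Table 1. Order of predictors:
# number of illus., color of illus., sex of persons, % text, subject matter.
R_XX = [
    [1.00, 0.52, -0.17, -0.14, 0.20],
    [0.52, 1.00, -0.10, -0.42, 0.01],
    [-0.17, -0.10, 1.00, 0.14, 0.31],
    [-0.14, -0.42, 0.14, 1.00, 0.03],
    [0.20, 0.01, 0.31, 0.03, 1.00],
]
R_XY = [0.35, 0.28, 0.22, -0.16, 0.42]  # each predictor with the criterion

def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x

betas = solve(R_XX, R_XY)  # standardized regression weights
multiple_r = sqrt(sum(b * r for b, r in zip(betas, R_XY)))
```

A variable earns a large weight here only if it correlates with the criterion and is not already accounted for by the other variables, which is the consideration the text says the multiple correlation handles automatically.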


The five variables studied herein should be considered all together. 
It may not be feasible, however, to achieve the optimum composite due to 
magazine policy or the expense of varying all variables optimally. The 
selection of optimum color or number of illustrations, etc. on the basis of 
means, considering these alone, can make noticeable improvement in the 
resulting starting readership. 

For prediction purposes the regression equation was computed. 
Table 2 shows the weights that each variable obtained. These weights 
are an approximation of the relative independent value of each variable 
to the success of the article. Use of this regression equation yielded a 
correlation coefficient of .56. The standard error of estimate for the R 
was 9.4%. Hence, the chances are that in about 68 out of 100 cases the 
predicted starting readership per cents will be within an error of 10 points 


D 


Prediction of Male Readership of Magazine Articles 671 


orless. We may be certain that very few starting readership estimates 
will be in error by more than 30%. 
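
The reasoning behind the "68 out of 100" and "30%" figures can be illustrated in Python, assuming that prediction errors are normally distributed with a standard deviation equal to the reported standard error of estimate. The normality assumption and the function name are additions for illustration and are not stated in the text.

```python
from statistics import NormalDist

STANDARD_ERROR = 9.4  # standard error of estimate reported in the text

def share_within(points, se=STANDARD_ERROR):
    """Expected share of predictions with absolute error <= `points`,
    assuming normally distributed errors with standard deviation `se`."""
    z = points / se
    return 2 * NormalDist().cdf(z) - 1

within_one_se = share_within(STANDARD_ERROR)  # roughly two thirds of cases
within_ten = share_within(10)                 # slightly more, since 10 > 9.4
beyond_thirty = 1 - share_within(30)          # more than three standard errors out
```

Thirty points is a little over three standard errors, which is why errors of that size would be expected only rarely under this assumption.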

Calculation of the coded score weights (weights dependent upon the 
measuring scale of the specific variable) gave the necessary data for the 
regression equation. The final equation is as follows: 


Predicted Starting Readership Per Cent Index = 30.4 (Index)
    + 5.2 × class value (No. of Illus.)
    + 1.7 × class value (Color of Illus.)
    + 4.0 × class value (Sex of Persons in Illus.)
    + 2.2 × class value (Proportion of Opening Page[s] Devoted to Text)
    + 1.9 × class value (Subject Matter).


An example showing how to obtain the predicted starting readership per 
cent by the regression equation is given following the description of 
Table 3. 

Table 3 shows the class values for each of the five variables. The
horizontal headings indicate the variables included in the study. The
characteristics of each variable and their corresponding numerical class
values are given in the body of the Table.² To obtain the predicted
starting readership for a given article the procedure is: (1) multiply for 
each of the five variables the appropriate class value by the coded score 
weight of the variable indicated in the regression equation; and (2) add 
up these five products plus the regression equation constant, 30.4. 

Take, for example, an article in an issue of the Post which has three
illustrations in full-color, males only in the illustrations, 25% of the opening
page(s) devoted to text, and whose subject is recreation, team sports. Reference
to Table 3 gives the following class values respectively for these
five characteristics of the article: 2, 4, 3, 2, and 7. Substituting these
values in the regression equation, multiplying by the corresponding
coded score weights, and adding the constant, 30.4, yields a predicted starting
readership of 77.3%.
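
The worked example can be reproduced with a few lines of Python. The constant and coded score weights are those of the regression equation in the text, and the class values 2, 4, 3, 2 and 7 are the ones that give the stated 77.3%. The dictionary keys and the function name are illustrative labels only.

```python
# Constant and coded score weights taken from the regression equation in the text.
CONSTANT = 30.4
WEIGHTS = {
    "number_of_illustrations": 5.2,
    "color_of_illustrations": 1.7,
    "sex_of_persons": 4.0,
    "text_on_opening_pages": 2.2,
    "subject_matter": 1.9,
}

def predicted_starting_readership(class_values):
    """Return the predicted starting readership per cent index."""
    return CONSTANT + sum(WEIGHTS[name] * class_values[name] for name in WEIGHTS)

# Class values for the worked example: three full-color illustrations,
# males only, 25% text on the opening page(s), recreation (team sports).
example = {
    "number_of_illustrations": 2,
    "color_of_illustrations": 4,
    "sex_of_persons": 3,
    "text_on_opening_pages": 2,
    "subject_matter": 7,
}
prediction = predicted_starting_readership(example)
print(round(prediction, 1))  # 77.3
```

Written out, the sum is 30.4 + 10.4 + 6.8 + 12.0 + 4.4 + 13.3 = 77.3.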

The Cross-validation. To determine the extent to which the weights 
of the characteristics of articles would be valuable in years other than the 
year 1946, when the articles included in this study appeared, we have 
applied this regression equation to 149 articles appearing in the 1947 
issues of the Post. The correlation between the actual and predicted 
starting readership per cents was .36. It was anticipated that this cor- 
relation would be somewhat higher, although it would be expected to be 
lower than the multiple (R = .56) obtained on the validating population. 
The lower correlation predictions in this later year are probably due to the 
change in interests over the period of the year intervening. Insufficient
classes in the subject matter variable may also be responsible for the
lower correlation.

2 The author has prepared a circular slide-rule called a Predictograph which has
interesting attention-getting possibilities and incorporates similar material shown in
Table 3.


Table 3
Numerical Class Values of Five Variables Used to Predict Starting Readership Per Cent
(N = 190) (R = .56)

Number of Illustrations               Class Value
0-2                                        1
3-7                                        2
8 and Over                                 3

Color of Illustrations                Class Value
Other                                      1
Black and White                            2
Duotone                                    3
Full-Color                                 4

Sex of Persons in Illustrations       Class Value
Female                                     1
Male and Female                            2
Male and No Data                           3

% Text on Opening Page(s)             Class Value
19% and Under                              1
20%-39%                                    2
40% and Over                               3

Subject Matter of Article                                                  Class Value
Business: General Aspects; Health and Hygiene                                   1
Personalities: Athletes, Business Men and Women, Statesmen, Other               2
Social Problems; U. S. Government: Foreign Relations and Politics               3
Other                                                                           4
Business: Unusual Jobs; War and Peace: General Aspects                          5
War and Peace: Army Life                                                        6
Recreation: Team Sports                                                         7
War and Peace: Campaigns, Personal Adventures                                   8

The average difference between the actual and predicted starting 
readership per cents was 8.3%. The predicted starting readership per 
cents were within 5% of the actual starting readership in 44% of the
articles, within 10% in 66% of the articles, and within 15% in 86% of
the articles. As would have been expected, those articles for which the
predictions were close, i.e., within 5%, were for the most part articles
which centered around the mean of the actual starting readership per 
cents. The higher or lower an observed starting readership, the poorer 
the prediction. Articles which had no similar precedents were the most 
difficult to predict. These were often the "other" cases. 
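
The cross-validation figures reported above (the correlation between actual and predicted values, the average difference, and the share of articles within 5, 10 and 15 points) could be computed along the following lines in Python. The five pairs of readership values are invented for illustration, and the function names are hypothetical. The 149 actual and predicted values for 1947 are not given in the article.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def validation_summary(actual, predicted, tolerances=(5, 10, 15)):
    """Summarize agreement between actual and predicted readership."""
    errors = [abs(a - p) for a, p in zip(actual, predicted)]
    summary = {
        "r": pearson_r(actual, predicted),
        "mean_abs_error": sum(errors) / len(errors),
    }
    for tol in tolerances:
        summary[f"within_{tol}"] = sum(e <= tol for e in errors) / len(errors)
    return summary

# Hypothetical actual and predicted starting readership per cents.
actual = [62.0, 70.0, 55.0, 81.0, 66.0]
predicted = [60.0, 64.0, 63.0, 69.0, 67.0]
summary = validation_summary(actual, predicted)
```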


The Applications 

The results of this study can be of value as aids to judgments on 
editorial matters. Thus the present regression equation gives the best 
possible prediction of the number of men who will start to read an article 
in the Post when only five easily determined variables are considered. 
Further research is being conducted to determine: (1) the value of these 
in predicting feminine readership; and (2) the value of these and 23 other 
variables in predicting readership. 

The primary application of the present regression equation and the 
simplified methods of computation outlined above lies in checking the 
value of a tentative layout for an article. When the estimated reader- 
ship per cent is below average, changes can be made, prior to publication, 
with an increase in the average readership of each issue of the magazine. 

The primary limitation that should be considered is the change in 
weights, and thus the equation's predictive value, as conditions change 
with time. Such change may result from: (1) the selection of new articles 
from the wide range of “types” of articles which can be written; (2) the 
varying and changing needs, interests, and preferences of a large group 
of readers; and (3) the unpredictability of the events which may occur 
in a given period and which influence readership. This would imply 
continued follow up with further study from time to time. 

Although the present study is specifically pertinent to layout features 
of articles in a magazine, the multiple regression technique can also be 
used for content evaluation, following the search for appropriate and 
predictive variables. Perhaps in time, the results of both layout and 
content analyses may, after adequate and intense investigation of the 
predictive significance of each variable and of the accuracy of their com- 
bined prediction, partially eliminate the need for the regular, current 
surveys. The necessity, however, for constant validation and revision
due to changing conditions of time and interests, excludes the practical 
possibility that predictions alone can ever completely supplant the survey 
method, although, if highly accurate, predictions should reduce both 
operating and survey time and expense. 


Summary and Conclusions 


One hundred and ninety articles in The Saturday Evening Post through- 
out 1946 were analyzed in an attempt to predict the number of men who 
would start to read the articles. Five variables believed to be deter- 
minants of starting readership were studied. These variables were: (1) 
number of illustrations; (2) color of illustrations; (3) sex of persons in 
illustrations; (4) proportion of opening page(s) devoted to text; and (5)
subject matter of the article. The criterion (starting readership per cent)
was the percentage of men who saw the article and started to read it. These
figures were obtained from readership survey results. 

The multiple correlation technique was followed. Calculation of the 
regression equation permitted prediction of the starting readership per 
cents. To check the validity of this equation, the starting readership 
per cents of 149 articles appearing in the 1947 Post were predicted and 
compared with the actual starting readership per cents obtained by the 
survey method. 

The following conclusions are supported: 


1. The multiple correlation and regression technique proved to be a 
successful method for predicting starting readership of Post articles by 
male readers.

2. The accuracy of the predictions of future articles should fall within 
a 10% difference between predicted and actual starting readership per 
cents in about 68% of the cases. This percentage error is satisfactory 
for most practical purposes. 

3. The order of the relative importance of the five variables included 
in this study is: (a) subject matter; (b) number of illustrations; (c) sex of 
persons in illustrations; (d) color or illustrations; and (e) proportion of text 
devoted to opening page(s). 
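
The accuracy check described in conclusion 2 can be sketched in a few lines. The following is a minimal illustration only, not the author's computation: it assumes the "10% difference" means ten percentage points between predicted and actual starting readership, and the function names and the six pairs of figures are hypothetical, invented for the example.

```python
import math


def within_tolerance_share(predicted, actual, tolerance=10.0):
    """Fraction of cases where |predicted - actual| <= tolerance (percentage points)."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two equal-length, non-empty sequences")
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance)
    return hits / len(predicted)


def rms_error(predicted, actual):
    """Root-mean-square difference between predicted and actual values."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical starting-readership per cents, invented for illustration only.
predicted = [52.0, 61.0, 47.0, 70.0, 58.0, 44.0]
actual = [49.0, 75.0, 50.0, 66.0, 45.0, 41.0]

share = within_tolerance_share(predicted, actual)  # share of articles within 10 points
error = rms_error(predicted, actual)               # typical size of a prediction miss
```

If prediction errors are roughly normally distributed, about 68% of cases fall within one standard error of the predicted value, which is presumably the sense in which the 68% figure is quoted.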


Received May 1, 1948. 


Book Reviews 


Kelly, George A. New methods in applied psychology. College Park, 
Maryland: University of Maryland, 1947. Pp. viii + 301. 

The title creates the illusion of a systematic and comprehensive treat- 
ment. The long sub-title indicates the more modest nature of the volume 
which represents the "Proceedings of the Maryland Conference on 
Military Contributions to Methodology in Applied Psychology held at 
the University of Maryland, November 27-28, 1945 under the auspices 
of the Military Division of the American Psychological Association." 

The contributions of psychologists to questions emerging in connection 
with the American war effort covered a wide range of problems. The 
material presented at the Maryland Conference concerns the establish- 
ment of criteria of performance (5 papers), methods for classification of 
military personnel (5), selection of officers (3), training (4), group morale 
(1), psychological research on military equipment (5), statistics and 
theory of psychological measurement (5), relationships between psy- 
chology and psychiatry (2), clinical diagnosis of mental disturbances (3), 
and use of electroencephalography (2 papers). 

A large chapter is devoted to post-war developments in the various 
branches of military psychology: personnel psychology (M. W. Richard- 
son, R. N. Faulkner), aviation psychology (J. C. Flanagan, J. G. Jenkins), 
clinical psychology (M. A. Seidenfeld, W. A. Hunt), and military research 
by civilian psychologists (C. W. Bray, M. S. Viteles), with a general 
summary and recommendations by D. G. Marquis. 

It is impossible to evaluate here separately each contribution or even 
the larger units. The readers will appreciate the summaries by J. C. 
Flanagan (instruments of measurement), W. R. Miles (engineering for 
human use), L. F. Shaffer (clinical techniques), and M. S. Viteles (criteria 
of performance) which integrate and round out the individual contri- 
butions to a given area. 

The comment will be limited to a few points. The volume leaves the 
impression that American psychology was methodologically well pre- 
pared to meet the problems posed by the war emergency. This does 
not mean that the specific techniques were always at hand. However, 
the training in sound research methods gave the psychologists a mental 
equipment with which to create the tools for attacking a given problem. 
One area, in particular, which witnessed a creative expansion of the 
study of behavior into a largely new phase of “human engineering” was 
the field of experimental testing of military equipment. It is hoped that 
the principles and techniques developed in this field (and in other fields 
of military psychology) will find a fruitful use in peace-times in designing 
industrial tools and machines. 

We should like to close the review with reference to the keynote address 
of the conference, “New opportunities and new responsibilities for the 
psychologists,” given by J. G. Jenkins, chairman of the Department 
of Psychology at the University of Maryland and war-time head of the 
Aviation Psychology branch, Bureau of Medicine and Surgery, U. S. 
Navy. Jenkins stressed that the research psychologist should choose 
problems not so much on account of their methodological neatness as 
because of their promise of returns which are of social importance. This 
means adding the criterion of social significance of a research project to 
the accepted criterion of statistical significance of the results. This is a 
serious challenge to the hundreds of Ph.D. candidates looking every year 
for a subject for their thesis. May it not be a challenge for their ad- 
visers, too? 

Josef Brožek 


Laboratory of Physiological Hygiene, 
University of Minnesota 


DeGruchy, Clare. Creative old age. San Francisco: Old Age Counseling 
Center, 1946. Pp. 143. $2.75. 

Lawton, George. Aging successfully. New York: Columbia University 
Press, 1946. Pp. 266. $2.75. 

Creative Old Age is the last in a trilogy of books on old age counseling. 
Dr. Lillien J. Martin, alert at 91, dying in 1943, began her old age counsel- 
ing in 1921. After nine years she published Salvaging Old Age in 1930, 
which was superseded by Sweeping the Cobwebs in 1933, which is the basic 
book on the Martin method in old age counseling. After Dr. Martin’s 
death her long-time associate and assistant, Mrs. DeGruchy, brought 
out the second book in the trilogy, A Handbook for Old Age Counselors, 
1944, which outlines more clearly for counselors the five counseling 
sessions usual in the Martin technique. 

But during Dr. Martin’s life and afterwards psychologists and coun- 
selors kept asking for a book of cases or case histories. This, at last, has 
been produced by Mrs. DeGruchy. In these three books, a trained and 
competent counselor who has occasional later maturity or old age cases 
presented to him, now has an old age counseling kit. Meanwhile, Mrs. 
DeGruchy is preparing a life of Dr. Martin and giving much time to 
training old age counselors at the San Francisco center. 


In Creative Old Age, Mrs. DeGruchy has preferred to give in artistic 
short-story form a dozen case histories, plus an account of two group 
projects. The practicing counselor, with loan or sale copies available, 
might well give it to a prospective or actual counselee to arouse faith and 
expectancy of results through contemplation of these "resurrections," 
as it were. For these are not just ordinary cases. Nearly any one of 
them would be a great challenge to any counselor. The counselor who 
has mastered the first and second books of the trilogy will find his curiosity 
on actual cases largely satisfied by this book. 

However, only the experience of a score or a hundred cases of his 
own will teach him what can't be put in books. Dr. Martin told the 
writer that she did her first three hundred cases without fee, because, she 
said with a twinkle, “How did I know I had done more good than harm?” 
Dr. Martin felt that her professional task in old age counseling had been 
to develop the technique, leaving it to her successors to multiply old age 
counseling centers. 

Dr. Martin and Mrs. DeGruchy hold the thesis that we should never 
retire or quit, but rather continue alive and living to the end with work 
(for money or for mental wages), with play (preferably active and self- 
expressive) and with rest, as parts of each day. Putting a meaning into 
life by finding a pattern (for revision if desirable), by finding his own best 
work or service expression, and by finding his own best play or recreation 
expression, is the heart of the Martin method in old age counseling. 

Aging Successfully is another book a counselor might loan to a coun- 
selee, but perhaps only to read some special chapter, for the book itself 
is twice as long as Creative Old Age. Many chapters seem very long and 
might well be made into two shorter chapters. “To write a book one 
must accumulate information, arrange it appropriately, and present it in 
an interesting fashion,” says Dr. Lawton to people who expect to “retire 
and write a book,” “just like that.” His quotations, jokes, poems, illus- 
trations, statistics, cases, are such as to make one wish for an index to 
find them again, though this book tries to be popular and not a learned 
volume with many footnotes. 

The fifteen chapters are addressed to all of us who are growing older, 
which means from the cradle up. In fact, Dr. Lawton says, old age 
patterns develop at nursery school age (p. 59). The book as a whole is 
a book on maturing and later maturity rather than on old age as such. 
It is a book on how to prevent old age rather than on how to cure old 
age. Don’t “grow old gracefully,” says Dr. Lawton, rather “grow old 
aggressively.” 

Many books are now appearing on this subject; more will follow as we 
become aware that 30 per cent of us are over 45, numbering 42,000,000. 
Soon 50 may be the "deadline" for retirement; we have 25,000,000 over 
50, 13,000,000 over 60, 5,000,000 over 70 and 1,000,000 over 80. Good 
books recently to appear include such as After 70 by Herbert N. Casson, 
the efficiency expert, and The Best Years by Walter B. Pitkin (1946). 
But Aging Successfully is different. For Dr. Lawton has been a specialist 
in adolescence and a professional school psychologist, and since 1936 has 
concentrated on problems of later maturity. He has been a psychological 
advisor for several old folks’ homes and has done much individual later 
maturity and old age counseling. He got his ideas together for popular 
presentation in a Cooper Union lecture course in 1945-46 and this book 
is the outcome. Everywhere, however popular the style, it is evident 
that a psychologist dictated the lines. 

Dr. Lawton spent ten days with Dr. Martin in 1943, just before her 
death, and he pays her high tribute, but he does not pretend to follow 
the Martin methods that she took fifteen or twenty years to evolve. He 
thinks “no special ‘new’ psychology, whether in theory or practice, has 
been invented in order to understand or to deal with the difficulties of 
older people.” His book is just “the application to a hitherto unexplored 
field—middle and later maturity—of universally tested and accepted 
principles of clinical psychology, mental hygiene, education, vocational 
counseling, rehabilitation.” 

Like Dr. Martin, Dr. Lawton in his chapter “Retire to, not from,” 
favors life-long aliveness, adventuring, usefulness, vitality. A novel 
feature is his “Bill of Rights for Old Age” in the form of a radio program 
given by older folk themselves. 

While both these books are on psychology each hints that the base 
problem of old age is even more sociological and economic. “The 
Eskimos put their old out on the ice floes. We prolong the lives of our 
old as long as possible but deny them opportunities for useful activity 
and personal enrichment.” Hence, both these books in a way are books 
on vocational counseling for later maturity and old age in a battle against 
an industrial society which, unlike an agricultural or socialized society, no 
longer has three-generation homes. In New York City are one hundred 
employment agencies, but not one of them especially for people over forty. 

Dr. Martin says that feeling unhappy and feeling useless are the tragic 
signs of old age and the emergency calls for old age counseling. If our 
social order were to provide their quota of jobs for those past 40 and half 
time jobs for those fit to work after sixty, this new sense of being wanted, 
needed and useful would alone, in America, be equal to one hundred 
thousand maturity and old age counselors in promoting good mental 
hygiene and happiness in people past forty, and in taking dread from 
people approaching forty. 


Christopher Ruess 


American Institute of Family Relations, 
Los Angeles 


Jucius, M. J., Maynard, H. H., and Shartle, C. L. Job analysis for 
retail stores. Research Monograph Number 37, Bureau of Business 
Research, Ohio State University, 1945. Pp. 65. $2.00. 

This is a “how-to-do-it” manual, a description of how a job analysis 
program may be developed in distributive industries. It deals with the 
values, limitations, and procedures in making a job analysis and describes 
how the results of job analysis can be applied to job evaluation in retail 
stores. The main contribution of this monograph is the detailed descrip- 
tion of the job analysis and job evaluation procedures with sample forms 
and instruction sheets. Although certain government publications pre- 
sent job analysis and job evaluation materials at a more reasonable price, 
and the importance of obtaining worker cooperation in the installation of 
these procedures is barely mentioned, the manual serves a useful function 
for the personnel worker in a distributive industry. 

William A. McClelland 


Brown University 


Mursell, James L. Psychological testing. New York: Longmans, Green 
& Co., 1947. Pp. 449. $4.00. 

Apparently organized as a text for education students with previous 
statistical training who will have little additional instruction on testing 
but who will be called upon to utilize test data frequently, Mursell 
describes his work as a “comprehensive and balanced account of the 
testing movement in psychology.” The material is drawn from areas 
of intelligence, aptitude, personality, interest, attitude and character 
study. His emphasis is on “intelligent comprehension . . . rather than 
detailed account of findings." He also includes a considerable amount of 
standard ID's subject matter, reinforced by a 521 item bibliography—not 
counting manuals from the 93 tests referred to in the text. 

Evaluation should be largely in terms of the author’s own aims, but 
there are some general criticisms which any book should meet. In a re- 
cent publication, recent material is expected. In terms of the tests 
selected, the forms described, the validation studies cited, the content 
is often wanting. The tests for discussion must serve many purposes. 
As examples of the various types they must be either popular, useful, or 
grossly bad. They must satisfy the student requirements of being con- 
versant with tests he is likely to encounter, of being aware of serious 
drawbacks or unusual potentialities in some of them, of knowing how to 
interpret or when to ask for the tests available. Here again, the choice 
however large is poor. There are glaring omissions, like the Kuder 
Preference Record in the area of interests, the MMPI and the Guilford 
batteries in personality. The test criticisms are often perfunctory, 
occasionally erroneous. Although it is delightful to find how effectively 
he dispenses with Link’s Personality Quotient Test, it is strange to see the 
Humm-Wadsworth actually given considerable acceptance. There is 
no one place where the beginner is systematically instructed in the 
procedures and sources for evaluating tests not included in the text. 

In meeting his own aims, Mursell’s performance is checkered. It 
seems he treats rather well methodology in theory and practice—the 
usual problems of reliability, validation, standardization, etc. but often 
contradicts himself later in discussion of specific tests. Definitions are 
important for beginners, but he seems to be troubled by them; often mere 
reference to Warren’s Dictionary would settle them. Scant attention is 
given to the history of the testing movement, although this would seem an 
excellent vehicle for the large emphasis he places on controversies and 
problems. Gratifyingly, he takes up many other issues—e.g., factor 
analysis, scoring technics, projective methods, limitations and potential- 
ities of psychometrics—expressing opinions which can of course only be 
appraised clearly after more data are available. Students should be aware 
of these issues, and be informed sufficiently to follow the developments as 
the movement grows. The inclusion of the ID’s material is of question- 
able value. Naturally anyone dealing with tests should be familiar with 
these findings, but of inestimably greater worth would be careful guidance 
in how to utilize specifically the data in actual testing and interpreting. 
Mere quotation of researches does not seem to achieve this, and would 
perhaps be more meaningfully presented in a course devoted to differential 
psychology. Nonetheless, the author has recognized a real need to be 
served by a book of this kind; the testing movement will gain real strength 
in proportion to the sophistication instilled in those actually called upon 
to utilize its results. 


W. Grant Dahlstrom 
Ohio Wesleyan University 


Moore, Bruce V., Kennedy, J. Ewing, and Castore, George F. The 
work training and status of supervisors as reported by supervisors in 
industry. Department of Psychology and Management Training 
Service, The Pennsylvania State College, 1946, pp. 31. $1.00. 

This study presents the results of a questionnaire submitted to 873 
supervisors in industries throughout Pennsylvania during the months of 
April and May of 1946. Men who were then currently employed as 
foremen were asked their opinions about their own jobs, foremen and 
supervisors in industry. The areas covered by the questionnaire con- 
cerned the duties considered to be the most important responsibilities of a 
supervisor, the training that was obtained prior to promotion to a super- 
visory position, the training that supervisors felt they should have re- 
ceived, and whether or not the supervisor identifies himself with manage- 
ment or with the employee working force. Certain autobiographical 
information as to age, length of service, both as an employee in the com- 
pany and as a supervisor, and the number of workers supervised was also 
obtained. 

Of the total number of supervisors reporting, 231 statements were 
obtained by means of personal interview and the remaining 642 were 
compiled from unsigned questionnaires mailed directly to Pennsylvania 
State College. In both instances complete anonymity was maintained. 
It is interesting to note that there appears to be no significant difference 
in the results of the personal interview and the questionnaire method. 

Several interesting highlights stand out as a result of this study. In 
defining their job most of the supervisors considered that their prime 
responsibility was to show others how to do the work and secondly a 
knowledge of men and how to keep them loyal and working. The average 
amount of training received by the supervisors was less than one year 
and in no instance did a supervisor report that he felt he had had sufficient 
training for the job. Approximately 60% of the supervisors merely carry 
out orders for management and only 15% help make policies affecting 
their departments. Yet, despite this lack of authority, about 70% of 
these men prefer to consider themselves as a part of management. Their 
major criticism of their job as supervisors seems to center around the fact 
that most of them feel that they have bad management and that manage- 
ment could do a great deal for them by keeping promises and explaining 
policy more completely. 

In conclusion, this study presents a well-ordered statistical summary 
of the present day status of the supervisor in industry. It is also signifi- 
cant in terms of methodology in that there seem to be no reliable differences 
between the information obtained from the questionnaires and from the 
personal interviews. 

Henry L. Sisk 


Stevenson, Jordan and Harrison, Inc. 
Chicago, Illinois 


New Books, Monographs, and Pamphlets 


Books, monographs, and pamphlets for listing and possible review should be sent to 
Donald G. Paterson, Editor, Department of Psychology, University 
of Minnesota, Minneapolis 14, Minnesota 


Family, marriage, and parenthood. Howard Becker and Reuben Hill, 
Editors. Boston: D. C. Heath and Co., 1948. Pp. 829. $5.00. 

Projective techniques. John E. Bell. New York: Longmans, Green and 
Co., Inc., 1948. Pp. 512. $4.00. 

Psychology for pastor and people. John S. Bonnell. New York: Harper 
and Brothers, 1948. Pp. 225. $2.50. 

Christian paths to self-acceptance. Robert H. Bonthius. New York: 
Columbia University Press, 1948. Pp. 254. $3.25. 

Development of the basic Rorschach score with manual of directions. Char- 
lotte Buhler, Karl Buhler, and D. Welty Lefever. Los Angeles: 
Rorschach Standardization Studies, 4759 Hollywood Blvd., 1948. 
Pp. 190. $3.00. 

The anatomy of melancholy. Robert Burton. New York: M. W. Drexler 
Book Co., 1948. Pp. 1036. $3.48. 

The myth of the magus. E. M. Butler. New York: The Macmillan Co., 
1948. Pp. 282. $3.75. 

How to supervise people in industry. Eliot D. Chapple and Edmond F. 
Wright. Deep River, Conn.: National Foremen's Institute, Inc., 
1948. Pp. 198. $2.50. 

Pupil personnel service. Frank G. Davis, Editor. Scranton, Pa.: Inter- 
national Textbook Co., 1948. Pp. 638. $3.75. 

Educational psychology. Robert A. Davis. New York: McGraw-Hill 
Book Co., Inc., 1948. Pp. 358. $3.00. 

Readings in the history of psychology. Wayne Dennis. New York: 
Appleton-Century-Crofts, Inc., 1948. Pp. 587. $4.75. 

Protecting our children from criminal careers. John R. Ellingston. New 
York: Prentice-Hall, Inc., 1948. Pp. 374. $5.00. 

Fainting: mechanisms and diagnosis. George L. Engel. Springfield, Ill.: 
Charles C. Thomas, Publisher, 1948. Pp. 170. $3.00. 

Dimensions of personality. H. J. Eysenck. New York: The Macmillan 
Co., 1947. Pp. 308. $5.00. 

Handbook of job facts. Alice H. Frankel. Chicago: Science Research 
Associates, 1948. Pp. 148. $3.00. 

Work adjustment in relation to family background. Jeannette G. Friend 
and Ernest A. Haggard. Applied Psychology Monographs No. 16. 
California: Stanford University Press, 1948. Pp. 140. $2.00. 


The labor leader. Eli Ginzberg. New York: The Macmillan Co., 1948. 
Pp. 191. $3.00. 

How to rear children in the atomic age. Henry H. Goddard. Mellott, 
Ind.: Hopkins Syndicate, Inc., 1948. Pp. 308. $3.00. 

Child offenders—a study in diagnosis and treatment. Harriet L. Goldberg. 
New York: M. W. Drexler Book Co., 1948. Pp. 230. $3.98. 

Speech and speech disturbances due to brain damage—evaluation and treat- 
ment. Kurt Goldstein. New York: M. W. Drexler Book Co., 1948. 
Pp. 500. $8.75. 

The roots of prejudice against the Negro in the United States. Naomi F. 
Goldstein. Boston: Boston University Press, 1948. Pp. 213. $2.50. 

Compulsion and doubt. E. Gutheil, Ed. New York: M. W. Drexler 
Book Co., 1948. Two vols. $6.35. 

Understandable psychiatry. Leland E. Hinsie. New York: The Mac- 
millan Co., 1948. Pp. 359. $4.50. 

Government and the arts of obedience. William W. Hollister. New York: 
Columbia University Press, 1948. Pp. 139. $2.00. 

Essentials of psychology. Donald M. Johnson. New York: McGraw- 
Hill Book Co., Inc., 1948. Pp. 490. $3.50. 

Personnel management. Michael J. Jucius. New York: Richard D. 
Irwin, Inc., 1948. Pp. 708. $6.00. 

Theory and problems of social psychology. David Krech and Richard S. 
Crutchfield. New York: McGraw-Hill Book Co., Inc., 1948. Pp. 
622. $4.50. 

Mental measurement. Sohan Lall. Allahabad, India: Kitabistan, 1948. 
Pp. 88. Rs. 9/-. 

Therapy through interview. Stanley G. Law. New York: McGraw-Hill 
Book Co., Inc., 1948. Pp. 313. $4.50. 

The people’s choice. Paul F. Lazarsfeld, Bernard Berelson, and Hazel 
Gaudet. New York: Columbia University Press, 1948. Pp. 177. 
$2.75. 

Attitude prediction in labor relations—a test of “understanding.” Lester 
M. Libo. California: Division of Industrial Relations, Stanford 
University, 1948. Pp. 15. $1.00. 

Personality projection in the drawing of the human figure. Karen Mac- 
hover. Springfield, Ill.: Charles C. Thomas, Publisher, 1948. Pp. 
160. $3.00. 

Psychiatry in a troubled world. William C. Menninger. New York: 
The Macmillan Co., 1948. Pp. 607. $6.00. 

The driving forces of human nature and their adjustment—an introduction 
to the psychology and psychopathology of emotional behavior and volitional 
control. D. T. V. Moore. New York: M. W. Drexler Book Co., 1948. 
Pp. 475. $6.50. 



Comparative psychology. Revised edition. F. A. Moss and others. New 
York: Prentice-Hall, Inc., 1948. Pp. 404. $4.50. 

An evaluation of selected schools of nursing. Helen Nahm. Applied 
Psychology Monographs No. 17. California: Stanford University 
Press, 1948. Pp. 120. $2.00. 

Social adjustment in old age. Otto Pollak. New York: Social Science 
Research Council, 1948. Pp. 199. $1.75. 

Art and artist—creative urge and personality development. Otto Rank. 
New York: M. W. Drexler Book Co., 1948. Pp. 443. $2.49. 

Attention and interest factors in advertising. Harold J. Rudolph. New 
York: Funk and Wagnalls, 1947. Pp. 119. $7.50. 

Music in relation to employee attitude, piece-work production, and industrial 
accidents. Henry C. Smith. Applied Psychology Monographs No. 
14. California: Stanford University Press, 1948. Pp. 59. $1.75. 

Therapeutic and industrial uses of music. Doris Soibelman. New York: 
Columbia University Press, 1948. Pp. 274. $3.00. 

Psychopathology and education of the brain-injured child. A. A. Strauss 
and L. E. Lehtinen. New York: M. W. Drexler Book Co. 
Pp. 220. $4.98. 

The Szondi test. L. Szondi. New York: M. W. Drexler Book Co., 1 
Pp. 324. $10.25. 

The Szondi test set. L. Szondi. New York: M. W. Drexler Book Co., 
1948. 48 test pictures. $11.00. 

Government regulation of industrial relations. George W. Taylor. New 
York: Prentice-Hall, Inc., 1948. Pp. 416. $4.00. 

Human relations in action. Calvin C. Thomason. New York: Prentice- 
Hall, Inc., 1948. Pp. 225. $3.50. 

New techniques of happiness. Albert E. Wiggam. New York: Wilfred 
Funk, Inc., 1948. Pp. 352. $3.75. 

Freud and his time. Fritz Wittels. New York: M. W. Drexler Book Co., 
1948. Pp. 451. $2.79. 

Medical hypnosis. Two vols. Lewis R. Wolberg. New York: M. W. 
Drexler Book Co., 1948. Vol. 1, Pp. 380, $5.50. Vol. 2, Pp. 475, 
$6.50. Both vols. $11.50. 

Pain. Harold G. Wolff and Stewart Wolf. Springfield, Ill.: Charles C. 
Thomas, Publisher, 1948. Pp. 94. $2.00. 

About the Kinsey report. No. 675. New York: The New American 
Library of World Literature, Inc., 1948. $.25. 

Assessment of men. OSS Assessment Staff. New York: Rinehart and 
Co., Inc., 1948. Pp. 541. $6.50. 




